To be able to use data in the long run, it is important to accompany the data with rich metadata and documentation. Not only to be able to reuse the data, but also during the research project, for example when new colleagues join the project team.
To be able to analyse the data, it is important to clearly describe the goals and context of data collection, in order to prevent that data is interpreted incorrectly. Therefore, it is important to annotate the research concepts, measurements and variables in a clear way. For reuse of the data, it is important to document who collected the data and under which conditions the data can be reused.
The following elements are important:
Metadata about the data set:
Metadata can be applied to the dataset (such as Dublin Core) but also to the variable level. For FAIR data, it is crucial that a dataset has good metadata about the variables in the dataset, to be able to interpret the different values of the data.
The use of metadata standards, or controlled vocabulary, for the metadata on variable level, makes it easier to combine different datasets into a larger dataset. This is often done in the field of data science.
Often, supplementary materials are necessary to be able to interpret and analyse the data. For example the questionnaire or a description of the context in which the data are collected. This can often not be interpreted from the raw data files but should be documented explicitly.
Examples of supplementary documentation:
To make data truly FAIR, it's important that metadata can be read by both humans and computers (machine-readable). The findability of the data will be further increased by using metadata standards. These are agreements within a (research) community about the structure of a dataset, how information is coded and how the content of data should be interpreted. A well-known metadata standard is Dublin Core Metadata Element Set.