Research Data

Best practices for Research Data Management

Planning and organizing data collection is essential for scientific success. It reduces the likelihood of unexpected problems such as data loss, errors, or data misuse and, just as importantly, it helps you comply with the specific requirements of research funders.
When data is organized according to a plan and then shared, it supports transparency and openness and increases the return on investment (ROI) of publicly funded research. Sharing data (1) reinforces the verification and replication of the original results, (2) promotes new research, (3) fosters collaborations, (4) avoids duplicate data collection and the spread of fraudulent data, (5) enhances visibility and citation, and (6) preserves data for future use.

 
How can you organize and manage your data for publication?

Open science democratizes access to scientific knowledge and thus enhances research development. Research Data Management is a fundamental requirement for the validation and reproducibility of scientific results. Moreover, funders are aligning their position with the mantra “As open as possible, as closed as necessary” to ensure transparency and openness.
The data collected and used to validate a scientific project should follow the FAIR principles, an acronym for findable, accessible, interoperable, and reusable.

 

 

Creating a DMP (Data Management Plan)

In the scope of H2020 projects and FCT grants, a detailed Data Management Plan (DMP) should be submitted along with the other project requirements. A DMP is a document that details in advance how the data for a specific project will be created, collected, stored, and documented, and who will oversee its preservation for long-term usability.
Although a DMP is not a static document and will be adapted during the research process, it serves as a guideline for following best practices in Research Data Management.

The Digital Curation Centre provides a guiding checklist with questions and tips covering:

  • Administrative data;
  • Data collection;
  • Documentation and metadata;
  • Ethical and legal issues;
  • Storage and replications;
  • Selection and preservation;
  • Data sharing;
  • Resources and responsibilities.

When creating a DMP, it is fundamental to organize and document data in a systematic way to facilitate future preservation and long-term storage. This means that you should be thinking in advance about formats and file names according to the instructions of the repository where the dataset is going to be archived.
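As an illustration of deciding on file names in advance, the following is a minimal Python sketch of a naming helper. The pattern used here (project, description, collection date, version) and the function name build_filename are assumptions made for the example, not a requirement of any particular repository; always check the naming and format instructions of the repository where the dataset will be archived.

```python
import re
from datetime import date


def build_filename(project: str, description: str, collected: date,
                   version: int, extension: str) -> str:
    """Build a file name following one fixed, hypothetical pattern:
    <project>_<description>_<YYYYMMDD>_v<NN>.<ext>
    """
    def slug(text: str) -> str:
        # Lowercase the text and replace every run of characters other
        # than a-z and 0-9 with a single hyphen, so the name contains
        # no spaces or special characters.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lower().lstrip(".")
    return (f"{slug(project)}_{slug(description)}_"
            f"{collected:%Y%m%d}_v{version:02d}.{ext}")


if __name__ == "__main__":
    print(build_filename("Soil Survey", "pH readings (site A)",
                         date(2024, 1, 15), 2, ".CSV"))
    # prints: soil-survey_ph-readings-site-a_20240115_v02.csv
```

Writing the date as YYYYMMDD and zero-padding the version number keeps files in chronological and version order when they are sorted alphabetically.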

Several online tools are available to help researchers create a DMP aligned with the funder’s requirements.

Common topics that funders require in a DMP are a description of the data (content, type, format, volume), the methodology followed when collecting the data, ethical and intellectual property issues, data sharing plans (how, when and who), and long-term preservation strategies.
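One way to keep track of these topics while drafting is to treat the DMP as a structured record and check which sections are still empty. The sketch below is only an illustration: the class DmpDraft, its field names, and the function missing_sections are hypothetical and do not correspond to any funder's official template.

```python
from dataclasses import dataclass, fields


@dataclass
class DmpDraft:
    """Hypothetical container for the common DMP topics; the field
    names are illustrative, not taken from a funder template."""
    data_description: str = ""        # content, type, format, volume
    collection_methodology: str = ""  # how the data are collected
    ethics_and_ip: str = ""           # ethical and intellectual property issues
    sharing_plan: str = ""            # how, when and who
    preservation_strategy: str = ""   # long-term preservation


def missing_sections(draft: DmpDraft) -> list[str]:
    """Return the names of the sections that are still empty."""
    return [f.name for f in fields(draft)
            if not getattr(draft, f.name).strip()]


if __name__ == "__main__":
    draft = DmpDraft(
        data_description="Survey responses, CSV, about 2 MB",
        sharing_plan="Deposit in a repository after publication",
    )
    print(missing_sections(draft))
    # prints: ['collection_methodology', 'ethics_and_ip', 'preservation_strategy']
```

Because a DMP is updated during the project, a check like this can be rerun whenever the draft changes.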

 

Sensitive Data

The GDPR (General Data Protection Regulation) became applicable in the EU in 2018, so there are a few steps you should follow to avoid any infringement or penalty. Information that allows a person to be identified includes PII (Personally Identifiable Information), PHI (Protected Health Information), and other sensitive information. Such data should be anonymized or pseudonymized, encrypted, and archived with a closed license.
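To make the idea of pseudonymization concrete, here is a minimal Python sketch that replaces a direct identifier with a keyed hash (HMAC-SHA256 from the standard library) and drops other identifying fields. The function names, the example record, and the choice of a 16-character pseudonym are assumptions for illustration only; this is not a complete compliance procedure. Pseudonymized data are still personal data under the GDPR, and the secret key must be stored separately from the dataset.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash. The same identifier and
    key always give the same pseudonym, so records stay linkable."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability (illustrative choice)


def pseudonymize_records(records, id_field, secret_key, drop_fields=()):
    """Return copies of the records with the identifier pseudonymized
    and the listed direct identifiers (e.g. name, e-mail) removed."""
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in drop_fields}
        row[id_field] = pseudonymize(str(record[id_field]), secret_key)
        cleaned.append(row)
    return cleaned


if __name__ == "__main__":
    # Hypothetical example data and key; never hard-code a real key.
    key = b"replace-with-a-key-kept-outside-the-dataset"
    participants = [
        {"participant_id": "P-001", "name": "Jane Doe", "score": 42},
        {"participant_id": "P-002", "name": "John Roe", "score": 37},
    ]
    for row in pseudonymize_records(participants, "participant_id",
                                    key, drop_fields=("name",)):
        print(row)
```

A keyed hash is used instead of a plain hash so that someone without the key cannot recompute the pseudonyms from a list of known identifiers.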

For more guidelines on sensitive data, check the OpenAIRE Factsheet about Personal Data and Open Research Data.

 
License and Archive

When the dataset is ready to be stored in an archive, you should choose the license that best fits the nature of your dataset. If it is sensitive data, you should archive it with a closed license. If the research data are classified as a literary work or open software, a CC BY 4.0 license is usually applied. A CC BY-SA (Share Alike) license is also compatible with Open Access policies and with Plan S, backed by Science Europe.

Finally, the research data should be archived in a trustworthy repository: an institutional repository, one specific to the research discipline, or a general repository. To find a repository for your research data, search first for a disciplinary repository; otherwise, look for an institutional repository that guarantees long-term preservation, or use a general repository such as Zenodo. Search re3data.org for repositories that fit your needs.