Research data life cycle

Sharing

In the era of FAIR (Findable, Accessible, Interoperable and Reusable) and Open Science, datasets should be made available to the public. Whenever possible, use a discipline-specific repository (with or without controlled access) to increase the FAIRness of your research outputs. If no such repository is available, use a general-purpose repository instead.

Finding a suitable repository type


Repository overview

  

Does your data contain personal or sensitive information that cannot be fully anonymised?

There may be cases where openly sharing data is not feasible due to ethical or confidentiality considerations. Depending on what the ethics review board that approved your study decided about data sharing, and on the level of permission granted by participants, it may still be possible to make your data accessible to authenticated users via a controlled-access repository.

See more guidelines on sharing human data

Is there a discipline-specific repository for your dataset?

Research data differs greatly across disciplines. Discipline-specific repositories offer specialist domain knowledge and curation expertise for particular data types. Using a discipline-specific repository makes your data visible to others in your community.

Does your institutional repository accept data?

Many institutions offer support to their employees for managing and depositing data. Institutional repositories that accept datasets provide stewardship, helping to ensure that your dataset is preserved and accessible.

Answered no to all questions above?

General data repositories accept datasets regardless of discipline or institution. These repositories support a wide variety of file types and are particularly useful where a discipline-specific repository does not exist.
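The four questions above can be read as a simple decision procedure. The following is a minimal sketch of that reading, assuming the questions are answered in the order given; the class name, field names and function name are hypothetical and only serve as illustration. It does not replace the judgement of an ethics review board or a data steward.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Answers to the repository questions (hypothetical field names)."""
    has_non_anonymisable_personal_data: bool
    discipline_repository_exists: bool
    institution_accepts_data: bool


def suggest_repository_type(profile: DatasetProfile) -> str:
    """Return a suggested repository type, checking the questions in order."""
    # Personal or sensitive data that cannot be fully anonymised:
    # controlled access may be possible, subject to ethical approval and consent.
    if profile.has_non_anonymisable_personal_data:
        return "controlled-access repository"
    # A repository for the discipline exists: prefer it for community visibility.
    if profile.discipline_repository_exists:
        return "discipline-specific repository"
    # The institution offers a repository that accepts datasets.
    if profile.institution_accepts_data:
        return "institutional repository"
    # Answered no to all questions above.
    return "general-purpose repository"


if __name__ == "__main__":
    profile = DatasetProfile(
        has_non_anonymisable_personal_data=False,
        discipline_repository_exists=True,
        institution_accepts_data=True,
    )
    print(suggest_repository_type(profile))
```

The ordering is a design choice taken from the order of the questions: constraints on sensitive data are checked first because they limit which repositories are permissible at all.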

There are several repositories for life science data types. A selection hosted by EMBL-EBI is listed below.

 

Genomics data

The repositories below are suitable for sharing genomics data.

The European Nucleotide Archive (ENA) hosts an instance of the Sequence Read Archive (SRA), the same archive that exists at NCBI. SRA accepts raw sequence data from any sequencing platform, generated in any research project. There are several ways to submit data to ENA; for more information, see the documentation. We also recommend the following two specialised guidelines:

For convenience, we have created templates for the most frequent data types and their corresponding ENA checklists. The templates come with instructions on how to do an interactive submission via the ENA Webin Portal, but the templates can also be useful for collecting all necessary descriptions and metadata when doing a programmatic submission. Download an appropriate template and fill in the sheets according to the instructions in the template:


ArrayExpress is tightly integrated with ENA and, similar to NCBI’s Gene Expression Omnibus database, can be used to archive experimental designs and analysis files based on the raw sequence reads. ArrayExpress has its own submission portal, where information is available on what can be submitted and how.


The European Genome-phenome Archive (EGA) is a service for sharing personally identifiable genetic and phenotypic data resulting from biomedical research projects. The repository is hosted by the European Bioinformatics Institute (EMBL-EBI) and the Centre for Genomic Regulation (CRG). Any data submitted to the repository is subject to controlled access, which means that access to the data will be granted only after a formal application procedure.

FEGA Sweden is a repository for storing and sharing personally identifiable genetic and phenotypic data in Sweden in a way that meets the requirements of the General Data Protection Regulation (GDPR). It is part of a federation of national nodes, the Federated European Genome-phenome Archive, which is tightly connected to the European Genome-phenome Archive (EGA). FEGA Sweden is not yet operational, but researchers may express their interest in depositing data to the repository by filling in a web form.

 

Imaging data

Different public repositories are available depending on the type of image data you have. Please see the table at BioImage Archive.

 

Metabolomics data

MetaboLights is a database for metabolomics experiments and derived information. The database is cross-species and cross-technique, and covers metabolite structures and their reference spectra as well as their biological roles, locations and concentrations, and experimental data from metabolic experiments.

 

Proteomics data

The ProteomeXchange Consortium provides globally coordinated standard data submission and dissemination pipelines involving the main proteomics repositories.

PRIDE admits protein and peptide identification/quantification data with the accompanying mass spectra evidence and any other related data types. Submission is done using the PX Submission Tool; see the tutorial.

PeptideAtlas admits SRM/MRM data that does not fit into PRIDE (targeted datasets). Submission is done via PASSEL.

 

Other data

Guidance on where to publish COVID-19 and Pandemic Preparedness research data can be found on the Swedish COVID-19 & Pandemic Preparedness Data Portal.

For other domain-specific repositories, see e.g. ELIXIR Deposition databases, Scientific Data recommended repositories, EBI archive wizard (help to find the right repository depending on data type), or FAIRsharing (the latter can also assist in finding metadata standards suitable for describing your datasets).

For datasets that do not fit into domain-specific repositories, use a general repository, e.g. SciLifeLab Data Repository, Figshare or Zenodo.

The SciLifeLab Data Repository, powered by Figshare and supported by SciLifeLab and the Knut and Alice Wallenberg Foundation through the Data-Driven Life Science (DDLS) program, is a repository for publishing any kind of research-related data, e.g. documents, figures, or presentations. Figshare is an open data repository used by researchers in numerous disciplines. Through an agreement with Figshare, SciLifeLab offers researchers and units the opportunity to upload and publish their research data through a dedicated portal.

Zenodo is a general-purpose repository operated by CERN. It can be used for sharing almost any kind of data, and also for describing data stored elsewhere. Zenodo doesn't enforce standardised descriptions of data, so datasets described there might be more difficult to find than those described in the two repositories mentioned above.

 

How can SciLifeLab help you share data?

If you are a researcher at a Swedish academic institution working in the life sciences, you can get help from the SciLifeLab Data Management support team. This support team can help you describe and deposit your data. Here are a few examples of the support that is offered:

  • Plan data submission
  • Identify suitable repositories
  • Assist during the submission process when publishing data and code
  • Assist in the creation of metadata records in SciLifeLab Data Repository
  • Advise on what needs to be done when working with sensitive human data
  • Advise on describing data with proper metadata

If you need any help connected to data submission, please contact us!

Resources

Below you will find resources for the sharing phase of the research data life cycle, in the form of training, guidance and/or tools.

Training resources

Guiding resources