Research data life cycle


During this phase, all data needed for the project's analysis is collected, either by generating new datasets or by reusing previously collected ones. This phase lays the foundation for the quality of both the data and the accompanying documentation. Hence, it is important that quality measures are implemented and that all steps of the collection are appropriately recorded.


Data documentation should clearly describe how the data was collected, so that someone else can understand and correctly interpret the data. Make use of electronic lab notebooks (often offered by the university / institute) and metadata standards, and name and organise the files produced appropriately.
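As an illustration of consistent file naming, the sketch below composes file names from a fixed set of fields. The convention itself (date, project, sample, description, version) and the function name `build_filename` are assumptions made for this example, not a SciLifeLab standard; adapt the fields to whatever convention the project has agreed on.

```python
import re
from datetime import date


def build_filename(project: str, sample: str, description: str,
                   collected: date, version: int, extension: str) -> str:
    """Compose a sortable, machine-friendly file name (hypothetical convention)."""

    def clean(part: str) -> str:
        # Replace every run of characters other than letters and digits
        # with a single hyphen, then trim hyphens and lowercase the result.
        return re.sub(r"[^A-Za-z0-9]+", "-", part.strip()).strip("-").lower()

    stem = "_".join([
        collected.strftime("%Y%m%d"),  # year-month-day sorts chronologically
        clean(project),
        clean(sample),
        clean(description),
        f"v{version:02d}",             # zero-padded version number
    ])
    return f"{stem}.{extension.lstrip('.').lower()}"


print(build_filename("Gut Microbiome", "S 01", "raw reads",
                     date(2024, 3, 5), 1, ".FASTQ"))
# prints 20240305_gut-microbiome_s-01_raw-reads_v01.fastq
```

Underscores separate the fields and hyphens separate words within a field, so the names can later be split back into their parts by a script.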

Data producers

The following SciLifeLab units offer data generation services, grouped by data type:

Please find below a selection of SciLifeLab Genomics services:

  • National Genomics Infrastructure (NGI) offers an infrastructure equipped with a comprehensive range of technology platforms for next generation sequencing (NGS) and genotyping.
    • Whole-genome sequencing (human)
    • RNA sequencing
    • Functional genomics & Epigenomics
    • De novo genome sequencing
    • Metagenomics
    • Single-cell genomics
  • Eukaryotic Single Cell Genomics (ESCG) offers high-throughput single cell transcriptomics services.
  • Microbial Single Cell Genomics provides customized, low to high-throughput experimental services for Swedish and international researchers working with prokaryotic and eukaryotic microbes.

Please find below a selection of SciLifeLab Bioimaging and Molecular Structure services:

  • The Advanced Light Microscopy (ALM) unit gives support with advanced fluorescence microscopy for nanoscale biological visualization, single-molecule spectroscopy measurement and analysis with fluorescence correlation spectroscopy (FCS), as well as FCS combined with super-resolution dynamical studies (STED-FCS). Moreover, light-sheet fluorescence microscopy support allows users to image live and/or optically cleared larger samples.
  • Cryo-EM offers access to state-of-the-art equipment and expertise in single particle cryo-EM and cryo-tomography (cryo-ET).

Please find below a selection of SciLifeLab Metabolomics services:

Please find below a selection of SciLifeLab Proteomics services:

  • Global Proteomics and Proteogenomics offers proteomics information combined with sample specific genomic and transcriptomics information.
  • Chemical proteomics is a national unit with expertise in supporting drug discovery and development through proteome-wide deconvolution of the targets and mechanisms of action of small molecules.

Also available is the Biological Mass Spectrometry (BioMS) national infrastructure, which offers cutting-edge mass spectrometry and related advanced technology platforms.


The PI and their academic institution are ultimately responsible for the data, so ensuring that all data is backed up is essential. The 3-2-1 rule of thumb states that there should be 3 copies of the data, on 2 different types of media, with 1 of the copies at a different physical location. This means that even if all of the project's research inputs and outputs are stored on a backed-up resource, an additional (third) copy of the data should still be maintained.
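The 3-2-1 rule can be expressed as a simple check over an inventory of copies. The sketch below is only illustrative: the `DataCopy` record, its fields, and the example labels are assumptions made for this example, and the check is only as good as the inventory it is given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    label: str       # free-text name of the copy (hypothetical)
    media_type: str  # e.g. "disk", "tape", "cloud"
    site: str        # physical location of the copy


def satisfies_3_2_1(copies, primary_site):
    """Check 3 copies, 2 media types, and 1 copy away from the primary site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media_type for c in copies}) >= 2
    has_offsite = any(c.site != primary_site for c in copies)
    return enough_copies and enough_media and has_offsite


# Working copy plus the backup kept by the same backed-up resource.
on_site = [
    DataCopy("project storage", "disk", "institute"),
    DataCopy("storage backup", "tape", "institute"),
]
print(satisfies_3_2_1(on_site, primary_site="institute"))   # prints False

# Adding a third copy at another location satisfies the rule.
with_offsite = on_site + [DataCopy("off-site copy", "disk", "remote")]
print(satisfies_3_2_1(with_offsite, primary_site="institute"))  # prints True
```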

At a minimum, essential data, such as raw data and other data that may be difficult or even impossible to recreate in case of corruption or loss, should be copied off-site (e.g. using Swestore or storage provided by the institute).
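After copying data off-site, it is worth confirming that the copy is identical to the original. One common way, sketched below, is to compare SHA-256 checksums using Python's standard `hashlib` module. The function names `sha256_of` and `copy_is_intact` are assumptions made for this example, and the sketch assumes both files are reachable as local paths (for instance through a mounted volume).

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read until an empty bytes object signals the end of the file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(original, copy):
    """Return True if the copy has the same SHA-256 digest as the original."""
    return sha256_of(original) == sha256_of(copy)
```

Storing the digest alongside the data when it is first received also makes it possible to detect later corruption without access to the original file.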

Consider uploading the raw data to a repository as soon as it is received, under an embargo if it is important that the data remains private during the project. This way there is always an off-site backup, with the added benefit of making the data sharing phase more efficient.

Resources & Training