Research involving human data

Research that involves humans or biological samples from humans should take ethical, legal and societal implications (ELSI) into account. This is also true when research involves data about humans. It is important to consider all of these aspects early in a research project, as the consequences of not doing so may be severe.

When planning a research project, it is often useful to think of ELSI with respect to the different stages in the data life cycle. For example: what laws do I need to follow during the data collection phase? Or, how can I maximise data reuse without compromising the personal integrity of data subjects? Which concerns you need to address depends on your research questions and the type of data you want to collect (for example, whether or not it includes personal data).

What is personal data?

Any data that can be linked, directly or indirectly, to a living person is considered personal data under the General Data Protection Regulation (GDPR). This can for example be a person’s name or personal identity number. Different pieces of information that, collected together, can lead to the identification of a particular person are also regarded as personal data. For instance, a street address in combination with a person’s gender can in some cases be sufficient to identify a particular person. One should keep in mind that genetic data about a deceased person may also be regarded as personal data under GDPR if the data can be used to identify a living relative of that person.
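
The point about combinations of information can be illustrated with a minimal Python sketch. It counts how often each combination of values occurs in a small table and reports the combinations that occur only once. The records, the field names and the function `unique_combinations` are hypothetical and exist only for this illustration. A combination that is not unique in a dataset is not thereby anonymous, so a check like this is at most a first screening and not a legal assessment.

```python
from collections import Counter

# Hypothetical toy records; the field names and values are illustrative only.
records = [
    {"street": "Storgatan 1", "gender": "F"},
    {"street": "Storgatan 1", "gender": "M"},
    {"street": "Storgatan 2", "gender": "F"},
    {"street": "Storgatan 2", "gender": "F"},
]


def unique_combinations(rows, fields):
    """Return the value combinations of `fields` that occur exactly once in `rows`."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Street alone singles nobody out here, but street plus gender does.
singled_out = unique_combinations(records, ["street", "gender"])
print(singled_out)
```

In this toy table no street address is unique on its own, yet two of the four records become unique once gender is added.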

What is sensitive personal data?

Some personal data is regarded as sensitive, for example data related to health and genetics. This includes all kinds of human genetic data (both RNA and DNA, and both somatic and germline information), and is likely to apply to other kinds of omics data as well.

Aggregated data (like population frequencies or number of sequence reads for a gene) might not be considered personal data (and hence not sensitive personal data), but a decision has to be made on a case-by-case basis.

Sensitive personal data should always be pseudonymised, which means that a particular person cannot be identified from the data unless it is combined with other data that has not been disclosed to the public. A common pseudonymisation procedure is to replace personal identity numbers with artificial identifiers.

It is important to remember that pseudonymised data is still regarded as personal data under GDPR. Even if the data is only referred to by an identifier that is not associated with the individual, and the researchers processing the data do not themselves hold the key that links the identifier to the individual, the data is still personal data, because the person can be identified indirectly.
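
The pseudonymisation procedure described above, replacing personal identity numbers with artificial identifiers, could be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions and not a vetted procedure: the function name `pseudonymise`, the field names, the `P-` prefix and the placeholder identity numbers are all hypothetical. The returned key is what makes re-identification possible, so it would have to be stored separately from the pseudonymised data and protected accordingly.

```python
import secrets


def pseudonymise(rows, id_field="personal_identity_number"):
    """Replace the identifier in each row with a random artificial identifier.

    Returns (pseudonymised_rows, key). The key maps each artificial
    identifier back to the original one and must be kept separately.
    """
    forward = {}   # original identifier -> artificial identifier
    used = set()   # artificial identifiers already handed out
    pseudonymised_rows = []
    for row in rows:
        original = row[id_field]
        if original not in forward:
            pseudo = "P-" + secrets.token_hex(8)
            while pseudo in used:  # regenerate in the unlikely case of a collision
                pseudo = "P-" + secrets.token_hex(8)
            used.add(pseudo)
            forward[original] = pseudo
        # Copy every field except the original identifier.
        new_row = {k: v for k, v in row.items() if k != id_field}
        new_row["pseudo_id"] = forward[original]
        pseudonymised_rows.append(new_row)
    key = {pseudo: original for original, pseudo in forward.items()}
    return pseudonymised_rows, key


# Placeholder identity numbers, not real ones.
sample = [
    {"personal_identity_number": "00000000-0001", "diagnosis": "A"},
    {"personal_identity_number": "00000000-0002", "diagnosis": "B"},
    {"personal_identity_number": "00000000-0001", "diagnosis": "C"},
]
pseudonymised, key = pseudonymise(sample)
```

The identifiers are drawn at random rather than derived from the identity number, so they cannot be recomputed from the original value. The same person still receives the same artificial identifier across rows, which is why the output remains personal data as long as the key exists.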

Important regulations to follow

Below we list some important regulations to follow when conducting research that involves human data.

Who is responsible for the data?

An important concept in GDPR is the data controller, which is the person or entity that determines the purposes and means of the processing of some personal data.

Many people wrongly believe that the principal investigator in a research project is the controller for the project’s data. In fact, this is practically never the case for Swedish academic research. Instead, the controller is typically the university (or sometimes the hospital) where the principal investigator is employed. That being said, the principal investigator should act as a representative of her institution and is responsible for ensuring that personal data is handled correctly in her projects.

The data controller must be identified before any personal data is processed in the project, that is, before data collection begins.

A controller can decide to use another entity to help process the data. That entity is called a processor. The controller must instruct the processor how the data is to be processed in a legally binding contract called a data processing agreement, and the processor must be able to show that they adhere to the GDPR when processing data on behalf of the controller (see also Data processing below).

Sharing human data

Data that cannot be used to identify a living person can normally be shared publicly. However, one must first be sure that the data is truly anonymous. This can be a difficult endeavour, especially if the research involves genetic data. When dealing with human data in a project, it is crucial to document every decision made regarding the data in relation to GDPR.

It may be possible to share personal data under some circumstances. Make sure to follow GDPR, the Ethical Review Act and other relevant regulations. See the considerations below for further information.

The GDPR states that the processing (including storing) of personal data should stop when the intended purpose of the processing has been fulfilled. There are, however, exemptions to this, for example when the processing is done for research purposes. Also, from a research ethics point of view, research data should be kept to make it possible for others to validate published findings and reuse data for new discoveries. How long data can be kept is also governed by what the data subjects have been told about the data processing.

See more guidelines on sharing human data

GDPR considerations

Before embarking on a new project, consider the following:

  • What personal data will be collected and processed?
    • Only collect data that are needed, i.e. do not collect more data than necessary
  • What is the purpose of collecting and processing the personal data?
    • Determine the purpose and stick to it
    • Do not use data for another, incompatible purpose
  • What is the legal basis for processing the personal data?
    • In other words, are there any laws or rules that permit you to collect and process these personal data?
    • Identify the legal basis for processing before processing begins
  • How and when are the collected data processed and used?
  • Has the data processing been reported to the data protection officer?
  • Who is the data controller of the personal data processed in the project?
  • Have data processing agreements been established between the data controller(s) and any data processors?
  • What technical and procedural safeguards have been established for processing the data?
    • Ensure that the data are accurate and up to date
    • Protect collected data
  • Have Data Protection Impact Assessments (DPIA) been performed for the personal data?
  • What is there to show that the individuals concerned have given their consent?
    • Inform in a transparent and honest way
  • What happens with the data after project completion?
    • How long will the collected data be kept? Erase the data when they are no longer needed.

Data Protection Officer (dataskyddsombud)

The role of the data protection officer is to check that the General Data Protection Regulation (GDPR) is complied with within the organisation. If personal data is processed in your research, you should report this to your institute’s Data Protection Officer (DPO).

Article 6(1) of the GDPR lists the conditions under which processing is considered lawful. Of these, consent and public interest are the ones relevant to research. You should determine what legal basis (or bases) you have for processing the personal data in your project.

Traditionally, consent has been the basis for processing personal data for research, but under the GDPR consent is only considered freely given if there is no imbalance between the controller and the data subject. In Sweden the use of consent as the legal basis for processing by universities for research purposes is therefore not recommended. Instead, public interest should probably be your legal basis. Note that if your legal basis for processing is consent, a number of requirements exist for the consent to be considered valid under the GDPR. Consents given before the GDPR might not live up to these.

Also note that even if public interest is the legal basis, other laws and research ethics standards might still require you to have consent from the subjects for performing the research.

Data Processing

All processing of personal data must comply with the principles relating to processing of personal data (Article 5 of the GDPR). According to these principles, to process personal data, the controller must:

  • Identify the legal basis for data processing before it starts
  • Inform in a transparent and honest way
  • Decide the purpose and stick to it
  • Only collect data that is needed
  • Not collect more data than necessary
  • Not use data for another incompatible purpose
  • Erase data when no longer needed (there might be exemptions to this for research data)
  • Ensure that data is correct and updated
  • Protect collected data – confidential and intact

The controller must also be able to demonstrate that the GDPR is followed.


  • A Data Processing Agreement is needed when a Processor (for example, someone from a different university than the controller) is processing the data (e.g. storing or analysing) on behalf of the controller.


As a controller you should:

  • Ensure that data processing agreements are established when needed.
  • Ensure that all Processors are informed on what can and cannot be done to the data.
  • Ensure that all processing is done in a compute environment with a suitable level of security, e.g. Bianca at UPPMAX.

As a Processor you should:

  • Only handle the data according to the instructions from the controller.
  • In the case of a data breach, accidental or otherwise, immediately report the incident to the controller.

Data Protection Impact Assessment (DPIA)

A Data Protection Impact Assessment (DPIA) is needed if the personal data processing is likely to result in a high risk to individual people’s rights (IMY on Impact assessments and prior consultation). The purpose of a DPIA is to prevent risks before they occur, by identifying what risks exist and drawing up procedures to meet those risks. In order to be able to decide if a DPIA is needed, you should perform a risk analysis. Analyse what risks your personal data processing may involve and suggest appropriate security measures. Document your findings so that you can demonstrate that you comply with the GDPR. If the risk analysis shows that a DPIA is needed, there are tools to help you, e.g. the PIA software from CNIL.

Security of processing

To ensure that the personal data that you process in the project is protected at an appropriate level, you should apply technical and procedural safeguards to ensure that the rights of the data subjects are not violated. Examples of such measures include, but are not limited to, pseudonymisation and encryption of data, the use of computing and storage environments with heightened security, and clear and documented procedures for project members to follow.

The security measures taken should be based on an evaluation of the risks for, and consequences of, the personal data not being correct and protected. Appropriate technical and organisational measures shall be implemented to ensure a level of security appropriate to the risk. It is advisable that the researcher seek guidance from the legal and information security functions of the university administration about this.

The UPPMAX Bianca system has been designed to have technical and information security procedures that are appropriate for processing sensitive human data for analysis. Using this system relieves the researcher from having to define these technical and security procedures themselves (at least for the analysis phase of a project). The researcher can decide to analyse (sensitive) personal data elsewhere, but then they will have to define the appropriate procedures. If the controller is an institute other than Uppsala University, a data processing agreement between that institute and UPPMAX/Uppsala University needs to be established (see instructions at UPPMAX).

Ethical considerations

Before embarking on a new project, consider the following:

  • Has the project (or parts of the project) undergone ethical review?
  • Have informed consents been collected from the research subjects?
  • Are there limitations of use defined in these?
  • Is the intended research purpose within the scope of the limitations of use that are defined in the ethics approval(s) and/or the informed consent(s)?

The purpose of these questions is to spell out what uses the subjects have consented to, and/or for what uses ethical approvals have been given. You can then assess whether the consents and ethical approvals for the datasets are compatible with the stated research purpose of the project.

Further questions

If you have further questions regarding sensitive personal data, you are welcome to contact the SciLifeLab data management team.


Please find below resources concerning research involving human data in the form of training, guidance and/or tools.

Guiding resources