Anonymisation and Pseudonymisation

This guidance provides a brief overview of the main differences between anonymisation and pseudonymisation, and how this will affect the processing of personal data.

Contents

Anonymisation
Pseudonymisation
Use in research
Applying the motivated intruder test
Onward sharing
Further reading

Anonymisation

Recital 26 of the GDPR defines anonymous information as ‘…information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable’.
The GDPR does not apply to anonymised information.

Anonymisation is the process of removing personal identifiers, both direct and indirect, that may lead to an individual being identified.

An individual may be directly identified from their name, address, postcode, telephone number, photograph or image, or some other unique personal characteristic.

An individual may be indirectly identifiable when certain information is linked together with other sources of information, such as their place of work, job title, salary, postcode or even the fact that they have a particular diagnosis or condition.
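The distinction between direct and indirect identifiers can be illustrated with a minimal Python sketch. It assumes a record held as a dictionary; the field names, the choice to keep only the first part of the postcode, and the 10,000-wide salary bands are all hypothetical choices made for this example, not requirements of the GDPR or the ICO. Removing and coarsening fields in this way does not by itself make data anonymous; that still depends on whether individuals could be re-identified in practice.

```python
from typing import Any

# Hypothetical field names, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "address", "telephone"}


def deidentify(record: dict[str, Any]) -> dict[str, Any]:
    """Drop direct identifiers and coarsen two indirect identifiers."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "postcode" in out:
        # Keep only the part of the postcode before the space, e.g. "WC1E".
        out["postcode"] = str(out["postcode"]).split(" ")[0]
    if "salary" in out:
        # Replace the exact salary with a 10,000-wide band (an assumed width).
        lower = (int(out["salary"]) // 10_000) * 10_000
        out["salary"] = f"{lower}-{lower + 9_999}"
    return out


record = {
    "name": "A. Example",
    "address": "1 Example Street",
    "telephone": "020 0000 0000",
    "postcode": "WC1E 6BT",
    "salary": 43_500,
    "job_title": "Lecturer",
}
cleaned = deidentify(record)
print(cleaned)
```

Note that the job title is left untouched here; in a small department a job title alone might identify someone, which is why the remaining fields still need to be assessed in context.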

Once data is truly anonymised and individuals are no longer identifiable, the data will not fall within the scope of the GDPR and it becomes easier to use.

While there may be incentives for some organisations to process data in anonymised form, this technique may devalue the data, so that it is no longer useful for some purposes. Therefore, before anonymisation, consideration should be given to the purposes for which the data is to be used.

The ICO’s Code of Practice on Anonymisation provides further guidance on anonymisation techniques.

The ICO’s Code suggests applying a ‘motivated intruder’ test for ensuring the adequacy of de-identification techniques.

Pseudonymisation

Pseudonymisation is not the same as anonymisation.

Pseudonymisation is defined within the GDPR as “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person” (Article 4(5)).
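One common way to meet the shape of this definition is to replace each identifier with a keyed hash, where the key plays the role of the ‘additional information’ that must be kept separately. The sketch below uses HMAC-SHA256 from the Python standard library; the function names, the example identifiers and the 16-character pseudonym length are assumptions made for illustration, and this is only one of several possible techniques (a separately stored lookup table is another).

```python
import hashlib
import hmac
import secrets


def make_key() -> bytes:
    """Generate a random secret key; store it apart from the research data."""
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, key: bytes) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    # Truncated to 16 hex characters for readability (an assumed length).
    return digest.hexdigest()[:16]


key = make_key()  # the "additional information", to be held separately
p1 = pseudonymise("participant-0001", key)
p2 = pseudonymise("participant-0001", key)
p3 = pseudonymise("participant-0002", key)
print(p1 == p2, p1 != p3)
```

The same identifier always maps to the same pseudonym, so records can still be linked within the study. Anyone holding the key can regenerate the pseudonym for a known identifier and so re-attribute the data, which is why the output remains personal data.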

Unlike anonymisation, pseudonymisation does not take controllers outside the ambit of the GDPR altogether. It does, however, help UCL meet its data protection obligations, particularly the principles of ‘data minimisation’ and ‘storage limitation’ (Articles 5(1)(c) and 5(1)(e)), and the requirement for ‘appropriate safeguards’ when processing for research purposes.

To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments (Recital 26).

Recital 26 provides that “Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person.”

Taken together, these two passages of Recital 26 mean that pseudonymised personal data can still fall within the scope of the GDPR.

Use in research

Where 'de-identified' or pseudonymised data is in use, there is a residual risk of re-identification; the motivated intruder test can be used to assess the likelihood of this. Once assessed, a decision can be made on whether further steps to de-identify the data are necessary. By applying this test and documenting the decisions, the study will have evidence that the risk of disclosure has been properly considered; this may be a requirement if the study is audited.

Applying the motivated intruder test

The study needs to consider the nature of the data, such as the rarity of attributes recorded, the size of geographical areas in question and access to other data that could be linked.

For example, a case of a rare condition in a sparsely populated area might be linked with other freely available information, such as social media, to identify an individual.
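As a rough aid to this part of the assessment, a study team could count how many records share each combination of potentially identifying attributes and flag the combinations that are rare. The sketch below is one simple way to do that in Python; the column names, the sample rows and the threshold of 3 are hypothetical, and a small group size is only a prompt for closer scrutiny, not a substitute for the motivated intruder test itself.

```python
from collections import Counter

# Hypothetical attribute names for this illustration.
QUASI_IDENTIFIERS = ("area", "condition")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many records share each combination of the given attributes."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def risky_groups(rows, threshold=3, keys=QUASI_IDENTIFIERS):
    """Return combinations shared by fewer than `threshold` records."""
    sizes = group_sizes(rows, keys)
    return {combo: n for combo, n in sizes.items() if n < threshold}


rows = [
    {"area": "Rural district A", "condition": "common condition"},
    {"area": "Rural district A", "condition": "common condition"},
    {"area": "Rural district A", "condition": "common condition"},
    {"area": "Rural district A", "condition": "rare condition"},
    {"area": "City B", "condition": "common condition"},
    {"area": "City B", "condition": "common condition"},
    {"area": "City B", "condition": "common condition"},
]
flagged = risky_groups(rows)
print(flagged)
```

Here the single record combining the rural area with the rare condition is flagged, mirroring the example in the text. Options at that point include widening the geographical area, grouping conditions into broader categories, or suppressing the record.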

We suggest involving members of the study team to ensure a wide range of input is captured. Although the test focuses on 'intruder' type threats, you should also consider the risk of inadvertent disclosure, for example through other sources of data available within the study.

Think about who an intruder might be (internal or external) and what their motivations might be: perhaps a disgruntled employee, an investigative journalist, or someone seeking to discredit UCL, the research team or the funder. Then consider what measures are being taken to protect the data from those threats.

Document who was involved in the assessment (roles), what was taken into consideration, what decisions were made and justification for those decisions. At the end, you should be able to arrive at a robust and defensible statement on the risks surrounding the data and your study's approach to addressing those risks.

Onward sharing

In cases where information is to be shared outside of the immediate study, consideration should be given to the context in which the ‘anonymised’ information is to be disclosed.

This is particularly important if the recipient has access to other data that could be linked to re-identify members of the ‘anonymised’ data set. There is further advice in chapter 7 of the ICO's Code of Practice (above), ‘Different forms of disclosure’ (p36).
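The linkage risk described above can be made concrete with a small Python sketch. It assumes a released data set with no names and a second, hypothetical data set held by the recipient that does contain names; all field names and values are invented for this example. A released record is treated as re-identified when exactly one person in the recipient's data shares the same values for the linking attributes.

```python
def link(released, auxiliary, keys):
    """Match released records to auxiliary records that share all `keys`."""
    index = {}
    for person in auxiliary:
        index.setdefault(tuple(person[k] for k in keys), []).append(person)
    matches = []
    for row in released:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:
            # Exactly one candidate: the released row is re-identified.
            matches.append((row, candidates[0]["name"]))
    return matches


# 'Anonymised' data to be shared (hypothetical).
released = [
    {"postcode_area": "WC1E", "job_title": "Lecturer", "diagnosis": "condition X"},
    {"postcode_area": "N1", "job_title": "Nurse", "diagnosis": "condition Y"},
]
# Data the recipient already holds (hypothetical).
auxiliary = [
    {"name": "Person A", "postcode_area": "WC1E", "job_title": "Lecturer"},
    {"name": "Person B", "postcode_area": "N1", "job_title": "Nurse"},
    {"name": "Person C", "postcode_area": "N1", "job_title": "Nurse"},
]
matches = link(released, auxiliary, keys=("postcode_area", "job_title"))
print(matches)
```

In this example the first released record links to a single named person, revealing their diagnosis, while the second remains ambiguous between two people. Running a check of this kind against data the recipient is known or likely to hold is one way to inform the decision on whether further de-identification is needed before sharing.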