Data sharing FAQ

Who can apply for Whitehall II data?

The WII data are available as a resource for the scientific community to maximize the value of the data for research and eventual patient and public benefit. 

You only need to apply if you would like to use individual level data. For aggregate data, please contact the researcher you wish to collaborate with.

The applicant and their collaborators must be bona fide researchers with an established scientific record, who must conduct high quality, ethical research when using WII data.

The main applicant must be the principal investigator of the research project.

What are individual level data and aggregate data?

Individual level data refer to datasets that contain records with information about individual study participants, as opposed to aggregate data, which are data combined from several individuals.

When individual level data are aggregated, groups of observations are replaced with summary statistics based on those observations.
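As an illustration of the difference, the short Python sketch below turns a handful of individual level records into aggregate data by replacing each group of observations with a count and a mean. The records, the variable names (`age_group`, `systolic_bp`) and the `aggregate` helper are invented for this example and are not taken from the WII datasets.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual level records: one row per participant.
records = [
    {"id": "A01", "age_group": "50-59", "systolic_bp": 128},
    {"id": "A02", "age_group": "50-59", "systolic_bp": 136},
    {"id": "A03", "age_group": "60-69", "systolic_bp": 141},
    {"id": "A04", "age_group": "60-69", "systolic_bp": 133},
    {"id": "A05", "age_group": "60-69", "systolic_bp": 146},
]


def aggregate(rows, group_key, value_key):
    """Replace each group of observations with summary statistics."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    # Only the group size and the group mean survive; individual rows do not.
    return {
        group: {"n": len(values), "mean": mean(values)}
        for group, values in groups.items()
    }


summary = aggregate(records, "age_group", "systolic_bp")
for group, stats in sorted(summary.items()):
    print(group, stats["n"], stats["mean"])
```

The resulting `summary` contains one entry per age group rather than one per participant, which is what distinguishes aggregate data from individual level data.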

All individual level data available to users are anonymised.

What does bona fide research mean?

Data sharing at WII involves an application process for bona fide researchers with an established scientific record. We follow the MRC definition of bona fide research (page 24). We will accept applications from bona fide researchers who:

i) conduct bona fide research. This involves high quality, ethical projects for research purposes using rigorous scientific methods. There must be an intention to publish the research findings for wider scientific and eventual public benefit, without restrictions and with minimal delay.

ii) have a formal relationship with a bona fide research organisation, which is an established academic institution, research body or organisation with the capability to lead or participate in high quality, ethical research. It is not a requirement that research is the primary business of that organisation, or that the organisation is publicly financed. In this context, a public-private partnership may qualify as a bona fide research organisation.

Why do I really need to fill in an application form, even if I am a close collaborator?

Our data sharing model follows a gated-access approach. This model, which involves an application process and the payment of a modest fee, is intermediate between the download of all data ever collected in the study (open access model) and a strict model where only a small number of close collaborators are allowed to use the data (restricted access model).

This model ensures that the reputations of the funding bodies, the WII participants and the research team are not compromised through unethical, premature or opportunistic analysis. 

What do I need to do to apply for Whitehall II data?

Access to Whitehall II data through DPUK is free of charge. Visit the DPUK website and follow the instructions to complete the application form and the list of variable names, both of which are available on the Whitehall II data sharing page.

If you propose to collaborate with a member of the WII research team, please complete the application form and the list of variable names that are available on the Whitehall II data sharing page.

Once your application is approved, you will need to send the signed data sharing agreement (a scanned copy via email will be sufficient) and proceed with the payment of £500 + VAT.

Please take the time to read the data sharing policy before applying. There is a summary of the terms and conditions on the last page.

I am a PhD student, can I apply for data?

Yes, but the main applicant should be the Principal Investigator or the PhD supervisor. We will offer the usual level of support, so no additional supervision should be expected from WII researchers.

When will the latest wave of data collection be available?

The data collected at one wave will be available as soon as the data collected at the next wave are ready for internal analysis. For instance, the data from phase 9 were available for data sharing in 2014, which is when the data from phase 11 were curated and ready for internal analysis.

How long does it take to receive the data?

It depends on when the application is received.

The WII research team meets once a month to discuss both the data sharing applications received via DPUK and collaborative applications involving members of the WII research team. Applications need to be received at least a week before the meeting. An application received less than a week before the meeting will only be discussed at the meeting one month later.

We will aim to release the data within two to four weeks of the receipt of the signed Data Sharing Agreement and of the finalised list of variables.

Who should I contact if I have any queries?

Applicants requesting data through DPUK should refer to information on the DPUK website. For collaborative projects, a WII Contact Researcher will be the point of contact for the duration of the project. If possible, s/he will be the preferred WII Contact Researcher named on the application form. 

Could my project be turned down?

The main reasons why the WII research team could request re-submission of a data sharing application are:

- Concerns that the project is unclear or vague.

- Concerns that the applicants might not be bona fide academic researchers with an established scientific record.

- Concerns that the project could damage the reputation of the WII participants, UCL and/or funding bodies through unethical, premature or opportunistic data analysis.

- Concerns that the scope of the project largely exceeds the recommended scope of 1-2 publications within the first 2 years.

- Concerns that the project proposal heavily overlaps with ongoing funded WII projects.

Can I request extra variables?

We recognise that further variable requests might be necessary once the preliminary analysis of data has started. For these extra requests, please contact DPUK or the WII Contact Researcher.

Major changes in exposures and/or outcomes will require a new application.

How do you ensure that the released data are anonymised?

Neither the Whitehall II data deposited in DPUK nor the data used by the Whitehall II researchers contain any individual identifiers such as names, addresses, telephone numbers or NHS numbers. In addition, we apply extra restrictions to the data released to external WII data users. We use two basic tools to achieve this:

1. Participants are identified using an anonymised ID that will be different for each WII data sharing project. The purpose is to avoid linkages that could potentially result in the identification of individuals.

2. Variables of a sensitive nature are removed to minimise the risk of participant identification (e.g. complete dates of birth and death). We also remove variables with low prevalence rates (e.g. medical event dates and diagnoses, job title, Civil Service department, etc.), as they could potentially enable the identification of subjects with unusual characteristics.
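For readers who want a concrete picture of these two tools, the Python sketch below shows one possible way to implement them. It is not the procedure used by the WII team. It assumes a keyed hash (HMAC-SHA256) with a secret key per project to derive project-specific IDs, and a hypothetical list of sensitive variable names. The function names, keys and example rows are all invented for illustration.

```python
import hashlib
import hmac

# Hypothetical variables withheld from external releases.
SENSITIVE_VARIABLES = {"date_of_birth", "date_of_death", "job_title"}


def project_specific_id(participant_id: str, project_key: bytes) -> str:
    """Derive an ID that is stable within a project but differs across projects."""
    digest = hmac.new(project_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def prepare_release(rows, project_key: bytes):
    """Swap the internal ID for a project-specific one and drop sensitive variables."""
    released = []
    for row in rows:
        clean = {
            name: value
            for name, value in row.items()
            if name not in SENSITIVE_VARIABLES and name != "participant_id"
        }
        clean["project_id"] = project_specific_id(row["participant_id"], project_key)
        released.append(clean)
    return released


# Hypothetical internal rows and per-project secret keys.
rows = [
    {"participant_id": "0001", "date_of_birth": "1950-03-14", "job_title": "Clerk", "bmi": 24.1},
    {"participant_id": "0002", "date_of_birth": "1948-11-02", "job_title": "Manager", "bmi": 27.8},
]
release_a = prepare_release(rows, b"secret-key-for-project-A")
release_b = prepare_release(rows, b"secret-key-for-project-B")
```

Because each project uses a different key, the same participant receives unrelated IDs in `release_a` and `release_b`, so the two releases cannot be linked on the ID column alone.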

However, given the nature of the data, it is virtually impossible to prevent the identification of specific individuals were one minded to do so. Therefore, it will be the responsibility of data users to ensure that no participant's identity is disclosed under any circumstances.