Sensitive Data and Trusted Research Environments

If you are carrying out research involving sensitive or confidential information, UCL's Centre for Advanced Research Computing is here to help.

We offer technical solutions and expert guidance on maximising the efficiency of your research while maintaining an appropriate level of information security.

What is a Trusted Research Environment?

To meet their legal, contractual and ethical obligations, many researchers working with sensitive data need to carry out their analysis in a Trusted Research Environment (TRE). TRE is a blanket term for a range of software and hardware configurations, together with integrated information governance processes, all of which aim to keep the data stored within them safe.

Implementations vary, but are likely to involve three elements: data and analysis tools held on a virtual machine with strictly limited internet access; a system that ensures users are granted access only to the data they need; and a secure remote desktop connection through which users reach their virtual machine.
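As a rough illustration of the "access only to the data they need" principle, the minimal Python sketch below models a deny-by-default permission check. The class, user and dataset names are hypothetical and chosen only for illustration; this is not how the Data Safe Haven or any particular TRE implements access control.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRegistry:
    """Hypothetical registry mapping each user to the datasets they may read."""

    grants: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, user: str, dataset: str) -> None:
        # Record an explicit permission for one user on one dataset.
        self.grants.setdefault(user, set()).add(dataset)

    def can_read(self, user: str, dataset: str) -> bool:
        # Deny by default: access exists only where it was explicitly granted.
        return dataset in self.grants.get(user, set())


# Illustrative names only (assumptions, not real users or datasets).
registry = AccessRegistry()
registry.grant("alice", "cohort_study")

print(registry.can_read("alice", "cohort_study"))  # True: explicitly granted
print(registry.can_read("alice", "genomics"))      # False: never granted
```

The design choice worth noting is the default: a user with no recorded grant is refused, so forgetting to configure access fails safe rather than open.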

How sensitive is my data?

Organisations supplying data under licence for research purposes will often specify what information security arrangements are required to work with their data.

If it's not clear what level of security needs to be applied, please refer to UCL’s information classification tool. If you’re still not sure, please get in touch with the Information Governance team: infogov@ucl.ac.uk.

The UCL Data Safe Haven

If you have confirmed that your project does involve data that needs to be stored and analysed within a Trusted Research Environment, UCL’s Data Safe Haven (DSH) may be the most straightforward option.

The DSH is a TRE certified to the ISO 27001 information security standard, and it complies with the principles contained in the NHS Digital Data Security & Protection (DSP) Toolkit. It can also be configured to comply with other security protocols if your data provider requires them.

The DSH has a number of features and tools to facilitate data analysis and research software development:

Windows virtual machines with RStudio, Jupyter and Stata installed
A Linux HPC cluster, similar to Myriad, with R, Python and GPU capabilities available
Python development environments with package installation from PyPI and Anaconda (via Artifactory)
R development environment with package installation from CRAN (via Artifactory)
Database options including MySQL and PostgreSQL with PostGIS available for geospatial analysis
GitLab for software version control

For more information on getting started with the Data Safe Haven, please see the Information Governance team’s onboarding guide.

Once your access has been granted, please refer to the DSH user guide and FAQs for full details on how to use the system.

There is also a DSH data science consultancy service available to help researchers working within the environment make the most of the tools available.

DSH demo

A short demonstration of how to upload a file to the DSH, log on to the environment and access the HPC cluster.

Video: https://mediacentral.ucl.ac.uk/Player/AJIc8Bbh

Other options

There are some use cases that the Data Safe Haven cannot cater for at present, for example projects involving very large image-based datasets. If your project's requirements are beyond the DSH’s current scope, please get in touch with the Research Software Development Group’s DSH team at rsdg-dsh@ucl.ac.uk.