
Advanced Research Computing


Data Safe Haven consultancy service

The Data Safe Haven (DSH) has recently undergone a transformation, with the addition of many new tools and features to help researchers work productively within the setting.

Part of UCL’s investment in the DSH includes the provision of advice on how to make the most of the resources now available.

If you would like to arrange a consultation with one of the Research Software Development Group’s DSH experts, please contact: rsdg-dsh@ucl.ac.uk.

New features

  • Artifactory, for installation within the DSH of most popular R and Python packages available on CRAN, Conda and PyPI
  • RStudio and Jupyter, for research software development in R and Python
  • GitLab, for keeping track of changes to your code and running CI/CD pipelines
  • Anaconda, for managing Python package dependencies

For full details, please see the documentation.

New hardware

The DSH now also includes an HPC cluster for running more computationally intensive analysis, including on GPU-enabled compute nodes. The cluster runs Linux and uses the Son of Grid Engine batch scheduler. If you are familiar with UCL’s Myriad cluster, the transition to the DSH cluster should be straightforward. For more information, please see the documentation.
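For readers who have not used a Grid Engine scheduler before, the sketch below shows roughly what a batch job request looks like. It is a minimal illustration, not the DSH's documented configuration: the helper name render_job_script is hypothetical, and the resource names (h_rt, mem, the smp parallel environment) are assumed to follow Myriad's conventions. Please check the DSH documentation for the values that actually apply.

```python
def render_job_script(name, runtime, memory, cores, command):
    """Return the text of a Grid Engine style job script.

    Hypothetical helper. The resource names below mirror Myriad's
    conventions and are assumptions, not confirmed DSH settings.
    """
    lines = [
        "#!/bin/bash -l",
        f"#$ -N {name}",           # job name shown in the queue
        f"#$ -l h_rt={runtime}",   # wall-clock limit, hours:minutes:seconds
        f"#$ -l mem={memory}",     # memory per core (assumed resource name)
        f"#$ -pe smp {cores}",     # number of cores (assumed environment name)
        "#$ -cwd",                 # run the job from the submission directory
        "",
        command,                   # the analysis to run on the compute node
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and submitting that file with qsub hands the job to the scheduler, which runs it on a compute node when the requested resources become free.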

The full picture

As detailed in the diagram below, researchers can now either run their analysis in RStudio or Jupyter on the Windows virtual machines accessed via the remote desktop, or use the Windows virtual desktop to run analysis on the DSH Linux cluster.

To work on the cluster, either log into RStudio or Jupyter on the cluster via the browser, or use SSH to access the cluster from the command line.

The login nodes are useful for exploratory data analysis, prototyping research software on a subset of your full data, and visualising results. Full analyses, however, should be run on the compute nodes, both to make the most of the resources available and to avoid causing problems for other UCL researchers trying to log into the system.
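One way to prototype on a subset of your data is to draw a small random sample from a large file before working with it on a login node. The sketch below is an illustration only: the function name sample_rows is hypothetical, and it assumes the data is a CSV file whose first row is a header. It uses reservoir sampling, so the full file is never held in memory.

```python
import csv
import random


def sample_rows(src_path, dst_path, n, seed=0):
    """Write a uniform random sample of up to n data rows to dst_path.

    Assumes the first row of the source CSV is a header, which is
    copied to the output unchanged. Returns the number of rows written.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    reservoir = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        for i, row in enumerate(reader):
            if i < n:
                # Fill the reservoir with the first n rows.
                reservoir.append(row)
            else:
                # Replace an existing entry with probability n / (i + 1).
                j = rng.randint(0, i)
                if j < n:
                    reservoir[j] = row
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(reservoir)
    return len(reservoir)
```

Once the code works on the sample, the same script can be pointed at the full file and run on the compute nodes.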

Diagram of DSH components

What can the consultation service help with?

We’re here to offer guidance on how to get the best out of the tools outlined above, including:

  • Research software development best practices with git and GitLab
  • Building data pipelines
  • Code optimisation
  • Implementation of machine learning algorithms, including on GPU resources

Some things we cannot change

The DSH exists to comply with strict information governance standards, including ISO 27001. This means that access to the system will always be via a secure remote desktop environment, such as Citrix, with two-factor authentication required and copy-paste disabled.

All machines within the DSH will continue to have their internet access disabled, meaning among other things that:

  • GitLab within the DSH cannot pull directly from an existing git repository outside the DSH, e.g. on GitHub. You can, however, zip a cloned repository on your local machine and upload it to the DSH via the file transfer portal
  • The installation of a minority of packages on Artifactory will fail because one of their dependencies tries to access a file over the internet. If this happens to you, please email dsh-support@ucl.ac.uk to request that your package be manually whitelisted
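As a rough illustration of the zip-and-upload workaround for repositories, the sketch below packages a local clone into a single archive ready for the file transfer portal. The function name zip_repository and the paths you pass to it are hypothetical, and the output directory is assumed to sit outside the repository so that the archive does not end up containing itself.

```python
import shutil
from pathlib import Path


def zip_repository(repo_dir, out_dir):
    """Zip a cloned repository, including its hidden .git directory.

    Hypothetical helper. out_dir is assumed to be outside repo_dir.
    Returns the path of the archive that was created.
    """
    repo = Path(repo_dir).resolve()
    if not (repo / ".git").exists():
        raise ValueError(f"{repo} does not look like a git clone")
    out = Path(out_dir).resolve()
    out.mkdir(parents=True, exist_ok=True)
    # Archive the repository folder itself, so that unzipping
    # recreates a single top-level directory named after the repo.
    archive = shutil.make_archive(
        str(out / repo.name),
        "zip",
        root_dir=str(repo.parent),
        base_dir=repo.name,
    )
    return Path(archive)
```

Because the .git directory is included, the commit history travels with the archive, so the unzipped copy inside the DSH should still behave as a normal git repository.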