Information Services Division


Case studies

Proof of concept case studies for new capabilities.

Case study 1: Critical Care Health Informatics Collaborative (CC-HIC)

The CC-HIC is a multi-centre UK intensive care research collaboration. It aggregates physiological data that are stored in the UCL DSH. Open-source software tools are under development to make these data ‘research ready’ while ensuring that patient privacy is protected.
Both the software development and the data analysis workflows require large amounts of CPU and RAM. They use a custom toolchain of predominantly R code, which interacts with a PostgreSQL database and depends on a number of R packages.

The project delivered a large RStudio server and a separate PostgreSQL database, both dedicated to the study, using the service hosting tooling. The researchers were able to install their R dependencies via Artifactory and manage their code using GitLab.

“The user experience so far has been wonderful. In many instances I wouldn’t be able to tell the difference between the DSH refresh infrastructure and my own personal machine. Almost everything ‘just works’.” – Dr Ed Palmer

Case study 2: Linked Consumer Registers - Consumer Data Research Centre (CDRC)

The CDRC assisted in the evaluation of the new DSH high-performance computing (HPC) component for their Linked Consumer Registers project. This project aims to link more than 20 consecutive annual public UK registers of electors with several sources of consumer data, creating annual updates to a longitudinal profile of the adult residents of almost every domestic property in the United Kingdom. As part of the study, millions of records stored in a PostgreSQL database needed processing with intensive string-matching procedures. The processing could be carried out in parallel and, owing to the volume of data, required large amounts of compute resources.
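To illustrate why this kind of workload parallelises well, the following is a minimal Python sketch of approximate string matching spread across several worker processes. It is not the study team's code: the function names (`best_match`, `match_all`), the 0.85 similarity threshold, the use of the standard-library `difflib` matcher and the in-memory sample names are all assumptions made for illustration. In the actual project the records were held in PostgreSQL and the matching procedures are not described in detail here.

```python
from concurrent.futures import ProcessPoolExecutor
from difflib import SequenceMatcher
from functools import partial


def best_match(name, candidates, threshold=0.85):
    """Find the candidate most similar to `name`.

    Returns (name, candidate, score). The candidate is None when no
    candidate reaches the (assumed) similarity threshold.
    """
    best, best_score = None, 0.0
    for cand in candidates:
        # Case-insensitive similarity ratio in the range [0, 1].
        score = SequenceMatcher(None, name.lower(), cand.lower()).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score < threshold:
        return name, None, best_score
    return name, best, best_score


def match_all(names, candidates, workers=4, chunksize=1000):
    """Match every name independently, sharing the work across processes."""
    worker = partial(best_match, candidates=candidates)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each name is matched on its own, so the work splits cleanly.
        return list(pool.map(worker, names, chunksize=chunksize))


if __name__ == "__main__":
    # Hypothetical sample data; real inputs would be read from the database.
    register = ["John Smith", "Jane Doe", "Alice Brown"]
    consumer = ["Jon Smith", "Jane Do", "Robert Green"]
    for name, match, score in match_all(consumer, register, workers=2):
        print(f"{name!r} -> {match!r} (score {score:.2f})")
```

Because each record is matched independently of the others, the number of workers can be raised to the number of available cores without changing the logic.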

A compute node was provisioned for the study using the service hosting tooling. Pre-built conda environments were provided according to research requirements. These were served from a dedicated HPC file server, also provisioned with the service hosting tooling. The researchers made use of 64 virtual CPU cores and 256 GB of RAM. This allowed the study team to perform the processing they required in around six days.

“The analyses that have been performed using the HPC facility would not have been possible in the regular DSH environment.” – Dr Justin van Dijk, Department of Geography

Further information on the Consumer Data Research Centre (CDRC) can be found at: data.cdrc.ac.uk