Observing the cataclysmic universe: a research data case study
3 July 2013
Our research data service allows safe storage and access for large volumes of research data. UCL astrophysicist Dr Dugan Witherick explains how the service will positively impact his work.
Dr Witherick's research uses data produced by the European Extremely Large Telescope (E-ELT) to study 'cataclysmic variable stars'. These are binary star systems comprising white and red dwarf stars in close proximity to each other.
Over the three-year lifetime of the project, the telescope produces approximately 50 terabytes (TB) of data. This presents a significant challenge for data management and storage. A robust networking and storage system is essential if this data is to be stored and shared with members of the international group working on the project.
What we can do
Research Data Services (RDS) will provide free advice and consultancy to Dr Witherick, helping to identify:
- the best data storage solution for his requirements
- other data services not provided by RDS
- how best to coordinate these services into a workflow.
Following these discussions, RDS can provide Dr Witherick with as much as 95 TB of data storage. This covers the original data from the E-ELT, as well as additional data that will be generated through analysis and simulation.
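As a rough illustration of what that allocation leaves for derived data, the arithmetic can be sketched as follows. The 50 TB and 95 TB figures come from this case study and are treated here as exact, although the article gives them as "approximately" and "as much as"; the function names are hypothetical.

```python
# Storage budget sketch. Figures are taken from the case study and treated
# as exact for illustration; in practice both are approximate upper bounds.
RAW_TB = 50.0         # expected raw telescope data over three years
ALLOCATION_TB = 95.0  # maximum storage RDS can provide


def headroom_tb(allocation_tb: float, raw_tb: float) -> float:
    """Space left for analysis and simulation output, in TB."""
    if raw_tb > allocation_tb:
        raise ValueError("raw data alone exceeds the allocation")
    return allocation_tb - raw_tb


def derived_to_raw_ratio(allocation_tb: float, raw_tb: float) -> float:
    """How much derived data fits per TB of raw data."""
    if raw_tb <= 0:
        raise ValueError("raw_tb must be positive")
    return headroom_tb(allocation_tb, raw_tb) / raw_tb


if __name__ == "__main__":
    print(f"Headroom: {headroom_tb(ALLOCATION_TB, RAW_TB):.0f} TB")
    print(f"Derived-to-raw ratio: {derived_to_raw_ratio(ALLOCATION_TB, RAW_TB):.2f}")
```

On these figures, analysis and simulation output could grow to a little under the size of the raw data set before the allocation is exhausted.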
Each member of the international project team will receive login details, so that they can access the data and collaborate on analyses.
A variety of data transfer protocols are available for moving data from one site to another - in this case, from the E-ELT site to the UCL data centre. The RDS team at UCL test each of these so that they can recommend the most appropriate methods. In this case RDS would recommend GridFTP, which is specifically designed to deal with large data files and is both faster and more secure than classic FTP.
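To give a sense of why the choice of transfer method matters at this scale, here is a minimal back-of-the-envelope sketch of how long 50 TB takes to move at different sustained rates. The link speeds and the efficiency factor are illustrative assumptions, not measurements from the E-ELT site or the UCL data centre, and the function name is hypothetical. In practice the data accumulates over three years rather than arriving in one batch.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(volume_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate days needed to move volume_tb terabytes over a link.

    link_gbps is the nominal link speed in gigabits per second.
    efficiency is the assumed fraction of that speed actually sustained
    (protocol overhead, contention); 0.7 is an arbitrary placeholder.
    """
    if volume_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid volume, link speed, or efficiency")
    bits = volume_tb * 1e12 * 8                     # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical link speeds chosen only to show the spread.
    for gbps in (0.1, 1.0, 10.0):
        print(f"{gbps:>5} Gbit/s: {transfer_days(50, gbps):.1f} days")
```

Even at a sustained 1 Gbit/s the full data set represents several days of continuous transfer, which is why a protocol designed for large files is worth testing for.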
The UCL data centre also has a direct connection into Legion, UCL's high performance computing facility, to accommodate the data analysis required for this project. This means that data can be transferred to and from Legion more quickly than over the normal UCL network, and without adding traffic to it.
Likely impact of the work
Dr Witherick is currently spending a significant amount of time setting up and managing his own data servers. RDS will relieve this burden by taking responsibility for data storage. The dedicated RDS support team ensure that any problems with the system are dealt with rapidly. RDS are also able to provide this service at a considerably lower cost than setting up an equivalently sized data store for a single project.
Through integration with UCL's existing infrastructure, access to the data for the project group will be managed through the central authentication service. This makes it easier to take advantage of other local services such as Legion.
When the project comes to an end after three years, RDS can offer Dr Witherick 500 GB of archive space so that the key data (data that isn't available elsewhere or can't easily be reproduced) can be preserved for future use.