Advanced Research Computing

Research Data Storage FAQs

At present, you must be on the UCL network in order to use any part of our service. If you are off site please use the UCL VPN.

We have a selection of guides to help you access and use your storage; please see the RDSS Access Guide.

Registering a project

How do I register a project and what information do I need to provide?

Only the Principal Investigator for the project can register.

Please visit the web administration portal and sign in using your UCL credentials:

Research Data Storage Web Administration Portal

Click “New project” to open the application form.

You will be asked to provide the following information:

  • One or more project administrators: an appointed person who you authorise to act as a point of contact for us to make decisions about adding or removing group members and other decisions about the storage.
  • A list of the project members (other than yourself and the optional administrators).
  • Start and end dates for your project.
  • Volume of data that you expect to have in terabytes. This is an upper limit. If you expect to use less than 1TB, enter 1.
  • If you ask for more than 1TB you will be required to submit further information regarding your intended usage. Additional capacity above 1TB is charged at £50 per TB per year.
  • Project title: We suggest choosing something that is likely to be unique to your project. A title such as “neuroscience” may not adequately distinguish your project from others.
  • Description: A brief description, approximately the length of a paper abstract. It may match the description used in the grant application.
  • Grants: optionally, enter one or more grants that fund the project. Enter each grant on a new line.
  • Agreement for you and your project team to abide by the conditions of use.

What is a project in the context of RDS?

A project (from the Research Data Services perspective) is a body of work requiring data storage that has a start date, an end date, a title, a description and one or more people that are granted access to it. A project will often relate to a particular grant application, but ‘unfunded’ research projects can use the service.

Who can register to use the service?

The owner of project space on RDSS must be a current member of UCL staff and the Principal Investigator (PI) of the project with which the data is associated. The application must be submitted by this person. The PI will be the person who applied for the grant to fund the project or, in the case of ‘unfunded’ research, the person leading the research project. Additional users can be any staff or research student with a UCL username and password. In the case of PhD projects, the supervisor should apply on behalf of their student.

Why do you require that the project PI completes the registration form?

  • Though there are usually several people involved in one project, it is useful to have one person as a single point of contact for the group who doesn’t change or disappear at short notice. PIs move to other institutions less frequently than other members of staff and students.
  • As the most senior member of a research project, the PI needs to accept responsibility for ensuring that no data is uploaded to the RDSS which would contravene the terms and conditions of the service, particularly with respect to its lack of certification for handling sensitive data.

Why do you require an end date for a project?

We are trying to guard against the following situations:

  • Important research data being forgotten about after the conclusion of a project, eventually becoming ‘orphaned’ as researchers move jobs or retire.
  • Unwanted data being left on the service, taking up space that could be better used by others.
  • Data not being moved to more appropriate long-term archive and repository services. Many research funders now require this, and it is useful to have a reminder before a project concludes that it’s time to archive the data.

Applying an end date to a project doesn’t mean that research on that topic should come to an end. However, it encourages our users to tidy up, organise and annotate their data. Once decisions have been made about what could reasonably be useful to themselves or to others, this data can be publicly archived and the rest discarded.

Why do I need to provide the title and description of the project?

Knowing what our users are working on helps us to make decisions about which storage technologies we should be investing in for the future. Occasionally a project description can alert us to special considerations that we may be able to advise on.

As part of the normal Research Data Repository service this information may be harvested into the metadata for the archive, saving you the effort of having to enter the same data again.

Do I need to justify my usage of the service in the description?

Provided that your project falls within our guidelines, you will be allocated space on our live storage service. It is not necessary to justify the value of your research to us; it’s not like applying for a grant!

Can I register more than one project?

Yes, but it should be clear that the purpose is different. There should be a distinct title and description.

I’m the PI, but wish to nominate someone else to manage the account, is that possible?

Yes, we have the concept of an administrator. You can nominate one or more administrators during the registration process, and can add or remove administrators at any time using the web administration portal.

An administrator can add / remove members and make other changes to the project. Like the Principal Investigator, they can request changes to storage volume and project end date.

Can I store sensitive data in the RDS?

The RDS takes data security seriously and it should not be possible for unauthorised users to access your project folder. The hardware we use is kept in secure server rooms, and we apply strict user access controls to ensure only those authorised to do so can access the data hosted by the service. That said, we are not at present certified to the ISO 27001 standard nor authorised to host restricted NHS data. We therefore prohibit the use of the Research Data Storage Service for hosting unencrypted personally identifiable information or anything else that falls under the GDPR or the UK Data Protection Act 2018.

As part of our terms of use, the Principal Investigator assumes the responsibility of ensuring that data uploaded to the service is not in breach of the Data Protection Act 2018 or other relevant legal and contractual agreements, including 3rd party agreements.

UCL Researchers working with sensitive data should consider using the Data Safe Haven, which provides a highly-secure ISO 27001-certified facility for data storage and processing.

UCL provides guidance on research data anonymisation in its web pages on Information Governance.

How much space can I have, is there a limit?

We don’t have a specific project size limit per se. So far we have been accepting storage requests up to 5 TB for all projects. For projects requesting larger quotas we will make decisions based on a balance of project size vs the remaining capacity on our storage system.

If your project has the budget to pay for storage beyond 5 TB, we will be happy to discuss this with you.

Do I have to pay?

Use of the RDSS is free for 1TB. If you will need more storage space than this then you will need to pay for the cost of the storage hardware on a per TB per year basis. The current rate is indicated at Research Data Storage Service. This is to ensure the service remains sustainable as data volumes continue to grow. Most major research funders will cover the costs of large-scale data storage in grant funding, and the cost of using the RDSS should be included in project bids.

If you are unable to cover the costs of the storage, please write to us at Research Data Support.
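As a rough illustration of the charging model, the sketch below estimates a yearly charge for a storage request. It assumes that the first 1TB is free and that each additional TB costs £50 per year, the figure quoted in the registration section; the function name and its parameters are hypothetical, and the current rate should always be checked on the Research Data Storage Service page.

```python
def estimate_annual_charge(requested_tb, free_tb=1, rate_per_tb_year=50):
    """Return an estimated yearly charge in pounds for a storage request.

    Assumptions (not confirmed by the service): the first `free_tb`
    terabytes are free and every further terabyte costs
    `rate_per_tb_year` pounds per year.
    """
    if requested_tb < 0:
        raise ValueError("requested_tb must not be negative")
    chargeable_tb = max(0, requested_tb - free_tb)
    return chargeable_tb * rate_per_tb_year


if __name__ == "__main__":
    for tb in (1, 3, 5):
        charge = estimate_annual_charge(tb)
        print(f"{tb} TB requested: about GBP {charge} per year")
```

Multiplying the result by the number of years in the project gives a figure that could be included in a grant bid.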

What happens when the project comes to an end?

The RDSS will get in touch with the Principal Investigator and any assigned project administrators shortly before a project is scheduled to complete. If you would like an extension to the project lifespan, that’s fine, although the project will need to pay for an extension if it requires more than 1TB of capacity (at the standard annual rate).

If the project is indeed drawing to a close, any datasets requiring publication or long-term preservation should be moved to an appropriate data repository. If there are discipline-specific data repositories in your field then these should be considered first, otherwise the UCL Research Data Repository is the recommended option. Further information is available from UCL Library Services.

Files that are no longer required after the end of a research project should be deleted, although please consult your research funder’s policies and the UCL Research Data Policy before deleting files to verify that they do not need to be preserved for longer.

Why are there two different types of storage?

During 2015-2017 the Research Data Storage Service was primarily based on an iRODS/WOS data management and storage facility. This did not prove to be as reliable as we would have liked, and required users to install and use a third-party software package (Cyberduck) if they wanted to upload or download data using a graphical user interface. In November 2017 we installed a new storage facility based on a type of file system known as GPFS or IBM Spectrum Scale. This has proved much more dependable and is the default storage used by all new projects.

The new GPFS facility can be mounted as a local drive, provides better metadata support, and offers optional functionality such as snapshotting. We are in the process of migrating users of the iRODS/WOS facility to the new storage, although this will take some time.

We have now closed write access to the old iRODS/WOS facility and users of that facility who require continued access should request a priority migration.

I would like to transfer multiple projects from my department to RDS; what should I do?

If you have responsibility for supporting the research data storage needs of multiple projects in your department, and would like to talk to us about transferring them to RDS, please get in contact with us at Research Data Support. We will arrange to meet up and discuss this in person. 

Can an undergraduate/postgraduate student register a project with RDS?

No, your PI must register the project.

Can someone from outside of UCL be the PI for a project with RDS?

No, the PI must be a current member of UCL staff.

Can we use the RDSS as a shared storage space for our group?

We strongly advise setting up separate project storage within the Research Data Storage Service for each real-life project you are working on, whether funded or otherwise.

Clearly identifying the project to which data belongs makes it easier to define a point at which a decision needs to be made about what to do with that data in the long term. This might involve discarding, archiving, or publishing the data, but making this decision is an important step in research data management best practice and will keep the costs of long-term storage down.

We understand that sometimes the same data is used across several different projects, and in these cases we would recommend that a separate project is created consisting only of the common datasets, and excluding anything that is specific to an individual project. Please write to us (researchdata-support@ucl.ac.uk) if this is what you need so that we can manage this common project separately in future.

Access and permissions

Can non-UCL collaborators be given access?

Yes, they can. Please see the adding and removing project members guide.

Can an undergraduate/postgraduate student have read/write access to a project?

Yes, if the PI nominates them as members. 

Will anyone else be able to access my data?

In principle, RDS staff and, on occasion, our hardware vendors can access your data. In practice, we only look in your project directories if you ask us to, or to investigate a problem with the operation of the service.

To what extent can I control access for different members of my group?

There isn’t much that a PI can do that is privileged compared to the other members of their project group, except for choosing the project members. We also cannot assign read-only access to certain members at present. Our how-to guide explains the options that are available:

RDS: How to control access for different members of a project

Can I host content from your service so that the public can access it?

Yes, by using the UCL Research Data Repository. This acts as a data publication platform where you can upload data and publish it with a unique Digital Object Identifier. It will then be findable, downloadable (with an embargo period if desired), and citable by others.

Further information about the Repository is available from Research-Data-Repository.

Guidance on using the Repository is available from Research-Data-Repository.

Can I access our storage like a drive on my computer? (i.e. as a mounted drive)

Yes. Instructions for accessing and mounting your storage are provided with the welcome email you should have received when the project space was created. A more thorough guide to mounting your storage is available from Live-Storage-Access-Guide.

Can I access the Research Data Service via Desktop @ UCL Anywhere?

Yes. Instructions for accessing and mounting your storage are provided with the welcome email you should have received when the project space was created. A more thorough guide to mounting your storage on Windows is available from How-To-Mount-Research-Data-Storage-Service-Windows.

How do I access my data from Research Computing (Myriad, Grace etc)?
How do I access my data from outside of UCL?
What programs can I use to access this service?

A list of recommended programs can be found in our access guide.

The underlying access mechanisms that we use for our service are SSH, SCP and SFTP. These are ubiquitous protocols and there are many client programs that can connect using them. Please see the bottom of our storage access guide for a list of some programs that can be used.
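For those who prefer to script their transfers, the sketch below shows one way an scp upload command could be assembled. It only builds and prints the command; nothing is sent. The host name, username and remote directory are placeholders rather than real service details (use the values given in your welcome email), and the helper name `build_scp_upload` is hypothetical.

```python
import shlex


def build_scp_upload(local_path, username, host, remote_dir, recursive=False):
    """Return the argument list for an scp upload without running it."""
    command = ["scp"]
    if recursive:
        # -r copies a whole directory tree.
        command.append("-r")
    command.append(local_path)
    command.append(f"{username}@{host}:{remote_dir}")
    return command


if __name__ == "__main__":
    # Every value below is a placeholder, not a real account or server.
    command = build_scp_upload(
        "results.tar.gz",
        "ucabxyz",
        "rdss.example.org",
        "/mnt/gpfs/live/ritd-ag-project-rd1234-abcdef/",
    )
    print(shlex.join(command))
```

Once the values are correct, the list can be passed to `subprocess.run(command, check=True)` to perform the transfer.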

Support and troubleshooting

Which email address should I use for contacting Research Data Services?

Please email Research Data-Support from your UCL email address to ensure your message reaches us.

Can we talk to you face to face?

You can come along and discuss things or get help with setting up your connection to our service at the regular research IT and data management drop-in sessions. Feel free to drop by with any questions about our services or for general advice regarding research data management.

Alternatively, if you would prefer a house-call, then you can email us at Research Data-Support.

What operating systems and software do you support?

Our service isn’t OS dependent, though within the RDS team we have experience in using Windows, OS X and Linux-based systems. If you wish to use our service with a GUI, the best support is for Windows and OS X, and if you want to use it from a command line, the best support is in Linux.

How do I check my usage and quota?

The easiest way is to log in to the web administration portal, find the project in the ‘My Projects’ list and click the title. This will take you to the project details page where you can see details of storage type, usage and quota.

Our how-to guide explains other options for checking your usage and quota:

How to check your usage and quota on RDS
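If your storage is mounted as a drive, a rough figure can also be obtained by adding up file sizes yourself. The sketch below is one way to do that, assuming the project directory is reachable as a local path. The function names are hypothetical, and the total may differ from the usage shown in the portal, which remains the authoritative figure.

```python
import os
import sys


def directory_usage_bytes(root):
    """Sum the sizes of all regular files below `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; leave it out.
                pass
    return total


def format_tb(num_bytes):
    """Format a byte count in decimal terabytes (an assumed unit)."""
    return f"{num_bytes / 1000**4:.3f} TB"


if __name__ == "__main__":
    # Pass the path of your mounted project directory as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print(format_tb(directory_usage_bytes(root)))
```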

How do I add/remove members of my project?

You can add and remove members of your project using the web administration portal. When a new member is added, they will receive a welcome email with details on how to access the storage.

Can I rename my project?

Yes, you can request a change of title using the web administration portal. This request will need to be approved by a member of the RDS team.
Please note that you will not be able to change the project code (e.g. ritd-ag-project-rd1234-abcdef), as this is a unique code that links to our storage system.

Can I change the end date for my project once I've started?

Log in to the web administration portal, find the project in the ‘My projects’ list and click the Edit button. Enter a new End Date and click ‘Submit update’. Your request will be sent to Research Data-Support and you will be contacted to discuss your project’s requirements.

Is my data encrypted?

We do not currently encrypt the data on our storage systems. Data is transmitted between our datacentres along optical fibres unencrypted, though these are difficult to intercept without datacentre access.

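Encryption is enabled by default on SSH connections to our GPFS facility, though you may wish to select a weaker but faster cipher if transfers are too slow (e.g. add "-c arcfour" or "-c none" as an argument to your scp command; the latter turns off encryption for data transfers). Note that recent OpenSSH releases no longer include the arcfour ciphers and a standard OpenSSH client does not accept "-c none", so these options may be rejected depending on the software you use.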
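The ciphers accepted by the "-c" option vary between SSH clients and versions. As a minimal sketch, assuming an OpenSSH client is installed locally, the following lists the ciphers your own client offers using `ssh -Q cipher`. The function names are hypothetical, and the result says nothing about which ciphers our servers will accept.

```python
import shutil
import subprocess


def parse_cipher_list(text):
    """Split the output of `ssh -Q cipher` into a list of cipher names."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def local_ssh_ciphers():
    """Return the ciphers offered by the local ssh client, or [] if none is found."""
    if shutil.which("ssh") is None:
        return []
    result = subprocess.run(
        ["ssh", "-Q", "cipher"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_cipher_list(result.stdout)


if __name__ == "__main__":
    ciphers = local_ssh_ciphers()
    for wanted in ("arcfour", "aes128-ctr"):
        status = "offered" if wanted in ciphers else "not offered"
        print(f"{wanted}: {status} by the local client")
```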

Is my data backed up?

Yes, there is a daily backup that runs overnight.

We also use redundancy to safeguard your data.

What kind of data transfer speeds am I likely to get?

It isn’t possible to accurately predict the transfer speeds that you’ll get using our service as it depends on a number of factors that include:

  • current loading on our service
  • underlying speed of the network where you are (local network)
  • current loading on your local network
  • the protocol used (SFTP, SCP etc)
  • the programs you use for file transfer
  • the cryptographic cipher used (an option on some programs)
  • whether you are transferring lots of small files or a smaller number of larger files (each new file transfer entails an overhead, hence one large archive file will transfer more quickly than many small files of the same total size)

We have seen transfer rates ranging from a few MB/s to a few tens of MB/s (megabytes per second).
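To get a feel for what these rates mean in practice, the sketch below estimates how long a transfer might take at a sustained rate. It is simple arithmetic using decimal units (1 GB = 1000 MB), the function name is hypothetical, and real transfers will vary for all of the reasons listed above.

```python
def estimate_transfer_hours(size_gb, rate_mb_per_s):
    """Estimate transfer time in hours for `size_gb` gigabytes
    at a sustained rate of `rate_mb_per_s` megabytes per second."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    seconds = size_gb * 1000 / rate_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    for rate in (5, 10, 50):
        hours = estimate_transfer_hours(1000, rate)
        print(f"1000 GB at {rate} MB/s: roughly {hours:.1f} hours")
```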

Is there anything I can do to improve the speed of data transfer?

Some suggestions for improving data transfer speed are:

  • Changing the time that the data transfer takes place – local network loading or loading of our system will vary over time. Work hours are likely to be busier than out of hours.
  • If it is practical, you could try connecting to the service from other locations around the university.
  • We have seen that some researchers are on 100 Mb/s networks (about 10 megabytes per second in practice), which is rather slow by modern standards. You may be able to encourage your local IT support to upgrade you to a gigabit (1 Gb/s) network.
  • If you are transferring a large number of small files, it may be better to compress them together into a smaller number of archive files such as zip, 7z or tar before transferring the data.
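The last suggestion can be scripted. The sketch below packs a directory into a single gzip-compressed tar archive using Python's standard tarfile module, so that one file is transferred instead of many. The function name is hypothetical, and the archive should be written outside the directory being packed.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack `source_dir` into a gzip-compressed tar archive.

    Returns the number of entries stored (the directory itself
    plus everything beneath it).
    """
    source = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps only the directory name, not the full local path.
        archive.add(str(source), arcname=source.name)
    with tarfile.open(archive_path, "r:gz") as archive:
        return len(archive.getnames())
```

For example, `bundle_directory("run_01", "run_01.tar.gz")` (both names are placeholders) produces a single archive that can then be uploaded in one transfer.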

Why do you have separate home and project space directories on the block storage (GPFS) service?

Some people are members of multiple projects. When someone logs in, unless the starting directory is specified, they need to be placed somewhere in the directory tree that relates to the individual, so that they can decide which project they wish to interact with. We decided to opt for the standard where everybody has a small bit of personal space on the system.

I've written about 500MB to my GPFS storage and I'm now getting messages saying no further writing is possible; what's going on?

By default when you first log in, you will be in your home area. The home area only has a small quota set on it as it’s not intended to be used for storing project data. In the registration email that you were sent there should be a location for your shared project directory. This will be found under 

/mnt/gpfs/UCL/

or

/mnt/gpfs/live/

All new projects are being placed under ‘live’.
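If you script your uploads, a small check can help you avoid writing to the home area by mistake. The sketch below tests whether a destination path sits beneath one of the two project locations named above. The function name, the example project code and the example home path are illustrative assumptions only.

```python
from pathlib import PurePosixPath

# The two project locations mentioned in the registration email.
PROJECT_ROOTS = (
    PurePosixPath("/mnt/gpfs/UCL"),
    PurePosixPath("/mnt/gpfs/live"),
)


def is_project_path(path):
    """Return True if `path` lies beneath one of the project locations."""
    candidate = PurePosixPath(path)
    return any(root in candidate.parents for root in PROJECT_ROOTS)


if __name__ == "__main__":
    # Both paths are made-up examples.
    for example in (
        "/mnt/gpfs/live/ritd-ag-project-rd1234-abcdef/data.csv",
        "/home/ucabxyz/data.csv",
    ):
        print(example, "->", is_project_path(example))
```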

How will I know what kind of storage my project is on?

It should say in the email you received when you joined our service. If we have since moved your project, this will have been discussed with you in subsequent emails.

Do you offer CIFS/NFS exports of the storage?

Not as a formal part of our service. If you believe it is essential for your needs then please get in contact with us at Research Data-Support and we can discuss this.

As a guide to terminology, CIFS (Common Internet File System) is the technology that Windows uses to connect network drives/folder shares. NFS (Network File System) serves a similar purpose but is aimed more towards Unix-like operating systems such as Linux.

Do you offer block level access to the storage?

No 

What is SSH?

SSH is short for Secure Shell. It provides a way to connect two computers together using command line instructions. This is great if you are comfortable using Linux and want to make changes to your data remotely. If not, then you will probably be more comfortable using one of the GUI-type applications or mounting the storage directly. Please see our storage access guide.

How do I know whether my data is sufficiently anonymised?

UCL provides guidance on research data anonymisation in its web pages on Information Governance.

How do I view my project details?

Log in to the web administration portal, find the project in the ‘My Projects’ list and click the title.

How do I change my project details?

Log in to the web administration portal, find the project in the ‘My projects’ list and click the Edit button. Most changes will be immediately executed, but changes to storage volume and project end date will be sent to Research Data-Support for approval.

How do I make someone else the PI?

Please send an email to Research Data-Support with the details of the new PI and the reason why they should be assigned the role. They will need to be a current member of UCL staff.

State whether you want to be removed from the project altogether or remain as a member or administrator.

How do I apply for more storage space or an extension to the end date?

Log in to the web administration portal, find the project in the ‘My projects’ list and click the Edit button. Enter a new End Date and / or volume and click ‘Submit update’. Your request will be sent to Research Data-Support and you will be contacted to discuss your project’s requirements.

Other data management services

Is there somebody who can help me write a data management plan?

One to one support and guidance for data management planning is available from UCL Library Services. See their website for details: Research Data Management.

Where can I store my data when the project comes to an end?

In the first instance, nothing will be removed without notifying you and giving you fair warning. If we haven’t heard from you, we will move your data to a medium-term repository facility from which it can be recovered on request.

We have recently deployed the Research Data Repository where you will be able to publish any data you wish to make public. Please see the Research Data Repository website.

Alternative data repositories may also be available to you. See the Research Data Management pages on the UCL Library Service website for guidance.