Storing & preserving data

Data storage and preservation are
key elements in the research data lifecycle. For this reason it is important to
think at the beginning of your research project about how and where you plan
to store and preserve the research data you collect.
Deciding early on which data to keep, which to discard and which file formats to use will
also inform your decision on where to store your data and help you estimate the costs of preserving it, in
both the short and the long term. Storage and preservation costs should be included in
your funding proposal. Planning ahead means that the unique data you have
collected will be easily found, accessed and re-used by you and other
researchers (if appropriate).
In this guide you will find information about storage, security,
long-term preservation, retention and disposal of data as well as information on sensitive and personal data.
Data storage
Below is a description of the options available for storing your research data during your research project (sometimes called "live data"). We recommend using UCL infrastructure whenever possible.
- Use UCL infrastructure
-
UCL Research Data Storage
If you are about to start working on your data, or are already working on it, you should use this infrastructure.
This is a centrally managed, resilient facility for the safe storage of small and large volumes of research data that ensures compliance with both UCL's and funders' policies. Up to 5TB of storage is available free of charge to any project registered by a UCL member of staff.
Guidance is available. You can also email the Research Data Services Team for help.
A UCL archive service for long term preservation of research data is currently in its pilot phase.
UCL Data Safe Haven
If you handle personal data as part of your research project, you can use this infrastructure. Information and guidance are available.
UCL N Drive
Storing your data on a UCL networked
drive (N: drive) will ensure daily backup and minimise risks of loss and security
breaches. All students and staff receive 100GB of storage space.
Staff can also store non-personal data on their S: Drive to enable colleagues working on the same project to access the
data.
- Cloud services (not recommended)
-
Many companies offer (relatively) low-cost networked online storage, known
as ‘cloud’ services. Although they are
convenient and easy to use, you should be cautious when considering using these
services, for the following reasons:
- Terms of use may mean that the provider has a right to access or even use
your data. If your data is confidential,
it would need to be protected by encryption, and some providers do not allow
encrypted data.
- Providers do not accept responsibility for corruption, loss or damage to
customers’ data, and there is no guarantee of continuity of service. Providers also have different backup
policies, varying from daily to monthly, and may or may not retain previous
versions for a period, meaning that previous versions of documents may be
overwritten by automatic syncing. Files held in cloud services
should therefore be backed up elsewhere. This
means that using cloud services as a backup, or as additional file space, makes no
sense.
- Accounts can be closed down without notification if providers believe they
have been misused. For some research
data this is perfectly possible, and in some cases likely: for
example, data that includes images of children or features nudity.
At all times you should follow the guidelines contained in the UCL Information Security Policy.
- Portable devices
-
The Information Security team has published guidance on the storage of sensitive data on portable
devices and media. If you think
it is necessary to do this, all data must be strongly encrypted.
- Hard copy records
-
Keep paper records that you use frequently close at hand within your immediate office
space, and store those you use only occasionally
off-site. Off-site storage is managed by
the Records Office
(third party storage services are not permitted) and you do not need to wait
until your study is finished before sending infrequently accessed records
off-site. Local filing rooms or ‘archives’
must not be used.
- NHS data
-
If you are accessing NHS patient data, you should contact the Information
Governance Advisory service to discuss your storage and research ethics
requirements.
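The advice above on backups and on the risk of corruption or silent overwriting can be supported by a simple integrity check. The following is a minimal sketch, not a UCL-provided tool: it records a SHA-256 checksum for every file in a folder and compares that record with the same record built from a backup copy. The function names are illustrative assumptions introduced here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(original: dict[str, str], backup: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing from the backup, changed in it, or present only in it."""
    shared = original.keys() & backup.keys()
    return {
        "missing": sorted(set(original) - set(backup)),
        "changed": sorted(name for name in shared if original[name] != backup[name]),
        "extra": sorted(set(backup) - set(original)),
    }
```

For example, calling `compare_manifests(build_manifest(Path("project_data")), build_manifest(Path("backup/project_data")))` (both paths are hypothetical) returns three lists. An empty "missing" list and an empty "changed" list suggest the backup still matches the working copy.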
Information Security
As mentioned in previous sections,
keeping your research data secure is very important. There are a number of methods
that you can use, from the most common, such as creating strong passwords for your devices and changing them regularly,
to more sophisticated ones, such as using
specialist software to encrypt flash drives or laptops.
Password-protecting individual documents
and saving data to local hard drives (i.e. standalone computers or laptops) are not recommended.
Information security is not limited to protecting existing files; it also
includes data erasure. Deleting files is not enough, as tools are available to
retrieve deleted data. You need to make sure that the data you want to discard,
especially sensitive data, is completely wiped from hard drives or portable drives. See the secure disposal guidelines in the ISD's Information Security Knowledgebase.
Long-term preservation and ‘archiving’
You should think about what will happen
to the data after the end of your project, where it will be stored, for how
long, and how to make it accessible in the long term. You will also, of course, need to decide what
will be made available, from raw data to final outputs. All of this may be determined, wholly or in
part, by your funder or research council.
- Formats & obsolescence
-
If you have followed the guidance on formats and on naming and version control,
you will have chosen formats on the basis of the future use of the data. Formats will become obsolete over time, and
you should plan for this. You should
also bear in mind, however, that the risk of obsolescence depends on the
software needed to open each format.
If you have the choice, we recommend using non-proprietary, open and well-documented formats.
- Options for long-term storage
-
You can store your data in the long term in the UCL environment,
using:
- UCL Discovery: if you need to make small datasets underpinning your publications publicly accessible, you can use this infrastructure. Guidance is available.
Email the Discovery support Team for help.
- UCL Digital Collections: if you have finished working on your data you can use this infrastructure. Information is available.
Email the Digital Curation Team for help.
- UCL Records Office: when research
has ended, hard copy (non-electronic) records which must be retained should be sent to the UCL
Records Office. This is the only
approved place of deposit for UCL’s administrative and research records.
- Departmental servers.
Funders and publishers might also have their
own repository or might direct you to deposit your research data in their
chosen repository.
External repositories can also be used to preserve your data. To find one,
you can search re3data.org, a
searchable registry of international research data repositories.
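The recommendation above to prefer non-proprietary, open and well-documented formats can be checked before deposit with a short script. The following is a minimal sketch under stated assumptions: the `OPEN_FORMATS` set and the `review_formats` function are illustrative and are not an official UCL list or tool. The script flags files for manual review and does not judge whether a format is acceptable.

```python
from pathlib import Path

# Illustrative assumption: a small set of widely documented, open formats.
# This is NOT an official UCL list; adjust it to your discipline and repository.
OPEN_FORMATS = {".csv", ".txt", ".json", ".xml", ".png", ".tif", ".tiff", ".pdf"}


def review_formats(root: Path, open_formats: set[str] = OPEN_FORMATS) -> dict[str, list[str]]:
    """Group files whose extension is not in open_formats, keyed by extension."""
    flagged: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        extension = path.suffix.lower()
        if extension not in open_formats:
            key = extension or "(no extension)"
            flagged.setdefault(key, []).append(path.relative_to(root).as_posix())
    return flagged
```

Each flagged extension (for instance `.sav` or `.doc`) then indicates a group of files that you may want to convert to an open format, or document carefully, before long-term storage.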
Sensitive and personal information
Ethical and legal issues should
always be considered when storing and preserving your research data. You will need to anticipate questions such as:
- should I encrypt my data?
- who will be able to access my sensitive data?
- do I have the right to store and preserve my data and for how long?
You can find further information in our guide
dedicated to handling sensitive &
personal information.
Retention and disposal of all records and data (whether electronic or not)
The UCL
Retention Schedule prescribes how
long records and data should be held. Section 2 deals specifically with research records,
including clinical trials.
- Hard copy records
-
When research
has ended, hard copy records which must be retained should be sent to the UCL
Records Office. This is the only
approved place of deposit for UCL’s administrative and research records.
- Clinical trial records
-
Where clinical trial records are concerned,
the Records Office accepts only:
- UCL Trial Master Files
- UCL Site Files
- Site Files from UCLH NHS
Foundation Trust, Royal Free London NHS Foundation Trust or Whittington
Hospital NHS Trust where the Chief Investigator holds a substantive or
honorary contract with UCL.
Storage of
records which do not fit into these categories is controlled by local Standard Operating Procedures (SOPs) for UCLH and the Royal Free.
The Joint Research Office's SOPs for
the content of trial files and archiving should be followed where applicable. Sponsors’ requirements for retention take
precedence over UCL’s rules, in which case archiving costs should be included
in the full economic costing
early in the approval process.
- Confidential waste, CDs & DVDs
-
Hard copy
confidential waste, CDs and DVDs must be disposed of via UCL Estates.
Your funders may have their own policy regarding the preservation of data that were collected as part of a project that they funded. Check our list of links to funders’ policies. Your funders’ policies generally take precedence over UCL’s policies.