Guidance on how to share research data and the potential benefits this brings.
Research data frequently remains valuable after the life of the research project for which it was generated. Sharing data can open up new avenues of research and enquiry without researchers having to recreate and collect identical data. Due to this, and other factors, research funders are increasingly requiring researchers to share their data.
Researchers should plan how and when to share their data (if it can be shared), whether there are reasons it cannot be shared, and how to ensure that other researchers will be able to make correct use of it.
Within this guide
- Why share?
- Handling copyright, Intellectual Property (IP) and licence issues
- Sharing Data: when & where?
- How to share data and arrange a material transfer agreement
- Citing data
There are many motivations to share research data. Many funders view research data as a public good that should be shared with the academic community and beyond.
Benefits of sharing research data
- Increases the academic profile of researchers, by ensuring credit is given to data as a research output in its own right.
- Increases the impact and visibility of research.
- Complies with many funders' requirements, including making best use of investment by avoiding duplication of data collection.
- Allows data to be independently validated and tested.
- Leads to new collaborations and partnerships.
- Provides great resources for education and training.
It is not always possible to share data, or you may wish to restrict access for some of the following reasons:
- Your data includes sensitive and/or personal information. Please see our guide for more information on handling sensitive and/or personal information.
- You intend to make a patent application or there is potential to generate revenue from your research.
- Your data is the result of collaborative or externally funded research governed by a confidentiality agreement.
The UCL Copyright team have a useful webpage to help you better understand copyright, IP and licences for data.
- UCL IP Policies
According to UCL’s Intellectual Property policy and related guidance for staff and students, UCL holds the copyright (and where applicable, database rights) to certain materials if they were discovered or created by UCL staff in the course of duties. You can find out more information about this on the Copyright for research data and software webpage. Most of the time you will be able to make decisions about where, when and how you will license and share your research data to make it as open and re-usable as possible. By default, UCL students own any IP they generate.
- Copyright and data
There is no copyright in pieces of raw data (i.e. which have been collected or generated in the course of research but have not yet been analysed or manipulated) since copyright protects original creative activity rather than facts.
Any other data are covered by copyright.
A database containing raw data may qualify for copyright protection if the selection and organisation of the data "…constitutes the author's own intellectual creation" (Copyright, Designs and Patents Act 1988). A database which does not qualify for copyright protection may still be protected by the separate database right, which offers more limited protection for the investment of resources that has gone into collecting and assembling the data. In the context of UCL's IP Policy, the database right is claimed by UCL.
- Licensing your data
Researchers involved in collaborative or sponsored work, including research students contributing to research projects, should reach agreement prior to commencing work as to how data will be licensed, including signing agreements to transfer any rights to a Principal Investigator, institution, sponsor or similar. Templates can be provided; please contact your Research Data Management Team for these.
UCL's Research Data Policy sets out UCL's expectations around the licensing of data created by UCL researchers:
"Unless covered by third party contractual agreements, legislative obligations or provisions regarding ownership, UCL research data will be provided using a Creative Commons CC0 waiver; supported by data citation guidelines similar to the existing publishing conventions. This will ensure re-used data are unambiguously identifiable and that appropriate credit and attribution is made."
UCL supports the use of Creative Commons licences for all research outputs, including research data. We suggest that you use a Creative Commons Attribution Licence for any data in which copyright arises (e.g. images).
The DCC's guide 'How to License Research Data' gives a useful overview on this question.
In order to reap the benefits of sharing data, it needs to be available in an appropriate place. Your funder may have some expectations about where your data is stored, how open it should be and the timescale in which it should be made available. Please refer to the guidance information on funders' policies.
- In a funder-sponsored data repository
In line with their expectations that data be preserved and made accessible for future use, some funders such as NERC or ESRC have set up data repositories (also known as data centres within NERC) to preserve and disseminate data created as part of their funded projects. Researchers funded by these bodies are expected to deposit in their data repositories.
- In an external open access data repository
If there’s a subject-specific repository suitable for your data and your funder doesn’t require you to deposit your data somewhere specific, this is the recommended option as it will allow others in your discipline to find your data more easily. You can find a list of external repositories and their characteristics in the international registry Re3data.org or look at our list of discipline-specific repositories. You can also contact the UCL data stewards for support with managing your data.
- In other repositories, including UCL's repository
If there is no subject-specific data repository that is appropriate for your data, you can deposit your data into UCL’s Research Data Repository. You can find more information about the repository and an FAQ on our website.
Sensitive data cannot be stored in the UCL Research Data Repository; for such data we recommend the UK Data Service.
There are other open, public repositories for different types of data, such as GitHub for software and code, or OSF. These may be preferable in some limited cases where the project team is predominantly outside UCL, or the staff involved in the project are no longer affiliated with UCL.
- Informal sharing
Posting data on a project website and informal peer-to-peer sharing are long-standing practices in academic communities, but they are not recommended by UCL. While these approaches to sharing do work, they present risks in terms of long-term sustainability and preservation of data.
For assistance with data sharing/data transfer associated with sponsored research, please see the UCL MTA transfer guidance and the process for raising a new request, supported by Research and Innovation Services.
- Anticipating legal, ethical and commercial constraints to release data
All legal, ethical and commercial constraints should be anticipated as early as possible, when planning your project and writing your Data Management Plan. Questions such as the level of access, the potential audiences and any embargoes on releasing data can then be raised early, so that you keep them in mind during your project.
Once you start your project, these questions should be addressed again before you start collecting data. For instance, if you are using consent forms for research participants, a section on data sharing should always be included and then explained to participants during fieldwork. Check the UK Data Service's guide to consent for sharing data.
If you are dealing with personal or sensitive data, see our dedicated guide for advice.
- Anonymising data
Anonymising data to protect participants' identity where needed is a way to address the issue of sharing personal and/or sensitive data. Before data that contains personal or sensitive information about individuals, organisations or businesses can be shared, it must be anonymised. Anonymising research data can be time-consuming and costly, so early planning is recommended. See our guide on handling sensitive and personal information.
The UK Data Service provides comprehensive advice on anonymising both quantitative and qualitative data.
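As a rough illustration only, the sketch below shows two common first steps for tabular data: dropping direct identifiers and replacing an exact age with an age band. The column names (`name`, `email`, `age`) and the helper functions are hypothetical, and these steps alone do not guarantee that a dataset is anonymous; the UK Data Service advice covers the fuller process.

```python
# Hypothetical column names treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "email"}


def age_band(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymise_row(row: dict) -> dict:
    """Drop direct identifiers and generalise age; leave other fields as-is."""
    cleaned = {key: value for key, value in row.items()
               if key not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned


# Placeholder data, not taken from any real study.
rows = [
    {"name": "Participant One", "email": "p1@example.org",
     "age": 34, "response": "agree"},
]
anonymised = [anonymise_row(row) for row in rows]
print(anonymised)
```

Free-text responses and combinations of indirect identifiers (for example occupation plus location) usually need separate review, which a simple rule like this cannot provide.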
- Describing your data to make them discoverable: metadata, documentation & DOIs
The description of your data should be consistent across your project. You will need to consider discipline-specific recommendations, your funder's expectations and your storage and dissemination plans.
As a general rule, DataCite recommends that your metadata should at least specify:
- an identifier (a DOI),
- a creator (the name and affiliation of the main researchers involved in producing the dataset),
- a title (the name or title by which the dataset is known),
- a publisher (the name of the entity that holds the dataset),
- a publication date (the year when the dataset was or will be made publicly available) and
- the type of resource you are describing.
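The list above can be turned into a simple completeness check before you deposit. The following is a minimal sketch, not a DataCite tool: the key names are illustrative and are not the exact property names of the DataCite schema, and every value in the example record (including the DOI) is a placeholder.

```python
# Illustrative key names for the fields listed above
# (not the exact DataCite schema property names).
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "title",
    "publisher",
    "publication_year",
    "resource_type",
)


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in `record`."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder record; none of these values describe a real dataset.
example_record = {
    "identifier": "10.1234/example-doi",
    "creators": [
        {"name": "Researcher, Example", "affiliation": "Example University"},
    ],
    "title": "Example survey dataset",
    "publisher": "Example Data Repository",
    "publication_year": 2024,
    "resource_type": "Dataset",
}

print(missing_fields(example_record))
```

An empty list means every field in the list above has a value; it says nothing about whether the values are accurate or meet your repository's own requirements.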
If you think that your data requires description with fields that are not covered in the DataCite schema, please contact the Digital Curation Team for advice.
Documentation, like metadata, should accompany data to ensure that the data are easily discoverable, understandable and re-usable by other researchers without the help of the data creators. You can choose to document your data in two ways:
3. Digital Object Identifiers (DOIs)
Obtaining a DOI for your data ensures that it can be cited (by you or someone else). It also allows you to track use of your data as others use and cite it, and makes your data uniquely identifiable and therefore easy to retrieve via a web search.
If you opt to store your data with a reliable repository, a DOI for your data will be minted at the point of deposit. See our list of repositories above.
Linking your data to your publications
Many funders expect you to explain in a formal citation how and where supporting data and/or metadata can be accessed (a statement such as "contact author" is not sufficient).
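To show what a formal data citation might look like, here is a minimal sketch that assembles one from a dataset's basic metadata. The ordering used (creator, year, title, publisher, DOI link) is an assumption based on common data citation practice; your funder, publisher or reference style may require a different layout. The function name and all example values are hypothetical.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Build a citation string in the order:
    Creator(s) (Year). Title. Publisher. DOI link."""
    authors = "; ".join(creators)
    # A DOI can be written as a resolvable link by prefixing https://doi.org/
    return f"{authors} ({year}). {title}. {publisher}. https://doi.org/{doi}"


# Placeholder values only; this is not a real dataset or DOI.
citation = format_data_citation(
    creators=["Researcher, A.", "Colleague, B."],
    year=2024,
    title="Example survey dataset",
    publisher="Example Data Repository",
    doi="10.1234/example-doi",
)
print(citation)
```

A sentence of this kind in a data access statement tells readers exactly where the supporting data can be found.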
Citing data that you are re-using
Check our guide on how to re-use data and how to choose your reference style.