Guidance on how to share research data and the potential benefits this brings.
Research data frequently remains valuable after the life of the research project for which it was generated. Sharing data can open up new avenues of research and enquiry without researchers having to recreate and collect identical data. Due to this, and other factors, research funders are increasingly requiring researchers to share their data.
Researchers should plan how and when to share their data (if it can be shared), whether there are reasons it cannot be shared, and how to ensure that other researchers will be able to make correct use of it.
Why share?
There are many motivations to share research data. Many funders view research data as a public good that should be shared with the academic community and beyond.
- Benefits of sharing research data
- Increases the academic profile of researchers, by ensuring credit is given to data as a research output in its own right.
- Increases the impact and visibility of research.
- Complies with many funders' requirements, including making best use of investment by avoiding replication.
- Allows data to be independently validated and tested.
- Leads to new collaborations and partnerships.
- Provides great resources for education and training.
It is not always possible to share data, or you may wish to restrict access for some of the following reasons:
- Constraints
- Your data includes sensitive and/or personal information. Please see our guide for more information on handling sensitive and/or personal information.
- You intend to make a patent application or there is potential to generate revenue from your research.
- Your data is the result of collaborative or externally funded research governed by a confidentiality agreement.
Handling copyright, Intellectual Property Rights (IPR) and licence issues
Background information is available to help you better understand copyright, IPR and licences. The UCL copyright blog answers a wide range of questions related to these issues.
- UCL IPR Policies
The basic principle of UCL's staff and student IPR policies is that IP rights rest with the individual student or staff member who creates a work which is protected by those rights. UCL waives the right, which it could claim under the legislation, to own the copyright in the work of employees which is created during the course of employment. UCL claims a wide-ranging licence to re-use work created by staff for its own purposes. There are some exceptions around collaborative and funded projects so researchers are advised to read the UCL Staff IPR Policy.
- Copyright and data
There is no copyright in pieces of raw data (i.e. which have been collected or generated in the course of research but have not yet been analysed or manipulated) since copyright protects original creative activity rather than facts.
Any other data are covered by copyright.
A database containing raw data may qualify for copyright protection if the selection and organisation of the data "…constitutes the author's own intellectual creation". A database which does not qualify for copyright protection may still be protected by the separate Database Right, which offers more limited protection for the investment of resources that has gone into collecting and assembling the data. In the context of the Staff IPR Policy, the Database Right is claimed by UCL.
- Licensing your data
Licensing your data sets out clearly the terms of use of your data, potentially avoiding future complications.
Researchers involved in collaborative or sponsored work, including research students contributing to research projects, should reach agreement prior to commencing work as to how data will be licensed, including signing agreements to transfer any rights to a Principal Investigator, institution, sponsor or similar. Templates can be provided; please contact your Research Data Support Officer for these.
UCL's Research Data Policy sets out UCL's expectations around the licensing of data created by UCL researchers:
"Unless covered by third party contractual agreements, legislative obligations or provisions regarding ownership, UCL research data will be provided using a Creative Commons CC0 waiver; supported by data citation guidelines similar to the existing publishing conventions. This will ensure re-used data are unambiguously identifiable and that appropriate credit and attribution is made."
UCL supports the use of Creative Commons licences for all research outputs, including research data. We suggest that you use a Creative Commons Attribution Licence for any data in which copyright arises (e.g. images).
The DCC's guide 'How to License Research Data' gives a useful overview on this question.
Sharing Data: when & where?
To reap the benefits of sharing data, you need to make it available in an appropriate place. Your funder may have expectations about where your data is stored, how open it should be and the timescale in which it should be made available. Please refer to the guidance information on funders' policies.
You have the choice between different options to deposit your data:
- In a UCL repository
UCL Discovery:
If you need to make small datasets underpinning your publications publicly accessible, you can use this infrastructure. Guidance is available. Email the Discovery Support Team for help.
UCL Digital Collections:
If you have finished working on your data you can use this infrastructure. Information is available. Email the Digital Curation Team for help.
An ISD archive service for long term preservation and sharing of research data is currently in its pilot phase.
- In a funder-sponsored data centre
In line with their expectations that data be preserved and made accessible for future use, some funders such as NERC or ESRC have set up data centres to preserve and disseminate data created as part of their funded projects. Researchers funded by these bodies are expected to deposit in their data centres.
- In an external open access data repository
A third option is to deposit your data with a "responsible digital repository", i.e. that "takes responsibility for data assets according to the FAIR data principles: findable, accessible, interoperable, and re-usable" and provides you with "adequate and persistent information" (i.e. metadata and a DOI) for your data. A list of external repositories and their characteristics can be found in the international registry Re3data.org. We have also prepared a guide where you will find discipline-specific repositories.
It is a good idea to ask your departmental IT specialist for advice.
- Deposit with a journal
Some journals require that supporting data be made available promptly as a condition of publication. Authors may be asked to provide editors and peer-reviewers with access to the data at submission, with data sets being made freely available to readers at the point of publication. Contact the Open Access Team for more information on publisher requirements.
- Informal sharing
Posting data on a project website and informal peer-to-peer sharing are long-standing practices in academic communities. They are not recommended by UCL. While these approaches to sharing do work, they present risks in terms of long-term sustainability and preservation of data. With the former, there is little or no control over who uses the data and how; the latter requires finding the right contact.
How to share data and arrange a material transfer agreement
For assistance with data sharing or data transfer associated with sponsored research, please see the UCL MTA transfer guidance and the process for raising a new request, owned by Research and Innovation Services. Visit the UCL Material Transfer Agreements (MTA) guidance.
- Anticipating legal, ethical and commercial constraints on releasing data
All legal, ethical and commercial constraints should be anticipated as early as possible, when planning your project and writing your Data Management Plan. Questions such as the level of access, the potential audiences and any embargoes on releasing data can then be raised so that you keep them in mind during your project.
See the UCL Research Integrity website for a list of resources, statements and codes of conduct related to ethics.
Once you start your project, these questions should be addressed again before you start collecting data. For instance, if you are using consent forms for research participants, a section on data sharing should always be included and then explained to participants during fieldwork. Check the ESRC guide on consent for sharing data.
If you are dealing with personal or sensitive data, see our dedicated guide for advice.
- Anonymising data
Anonymising data to protect participants' identity where needed is a way to address the issue of sharing personal and/or sensitive data. Before data that contains personal or sensitive information about individuals, organisations or businesses can be shared, it must be anonymised. Anonymising research data can be time-consuming and costly, so early planning is recommended. See our guide on handling sensitive and personal information.
The UK Data Archive provides comprehensive advice on anonymising both quantitative and qualitative data.
- Describing your data to make them discoverable: metadata, documentation & DOIs
The description of your data should be consistent across your project. You will need to consider discipline-specific recommendations, your funder's expectations and your storage and dissemination plans.
1. Metadata
If you don't need to follow a discipline-specific schema, or funder's recommendations, then we advise you to use the DataCite metadata schema (see p.7 for a summary table).
As a general rule DataCite recommends that your metadata should at least specify:
- an identifier (a DOI),
- a creator (the name and affiliation of the main researchers involved in producing the dataset),
- a title (the name or title by which the dataset is known),
- a publisher (the name of the entity that holds the dataset),
- a publication date (the year when the dataset was or will be made publicly available) and
- the type of resource you are describing.
If you think that your data requires description with fields that are not covered in the DataCite schema, please contact the Digital Curation Team for advice.
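As a rough illustration, the minimum fields listed above can be checked before deposit with a few lines of Python. This is only a sketch: the key names (`identifier`, `publication_year`, and so on), the `missing_fields` helper and the example record are assumptions made for illustration and are not the official DataCite property names or an official validator.

```python
# Minimum fields named in the guidance above. The key names are illustrative
# assumptions, not the official DataCite property names.
REQUIRED_FIELDS = (
    "identifier",        # a DOI
    "creator",           # main researchers and their affiliations
    "title",             # name by which the dataset is known
    "publisher",         # entity that holds the dataset
    "publication_year",  # year the dataset was or will be made public
    "resource_type",     # type of resource being described
)


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        is_blank_string = isinstance(value, str) and not value.strip()
        if value is None or is_blank_string or value == []:
            missing.append(field)
    return missing


# Hypothetical record with the publication year left blank.
example_record = {
    "identifier": "10.1234/example-doi",  # hypothetical DOI
    "creator": ["Doe, Jane (Example University)"],
    "title": "Example survey dataset",
    "publisher": "Example Repository",
    "publication_year": "",
    "resource_type": "Dataset",
}

print(missing_fields(example_record))  # ['publication_year']
```

A check like this only confirms that the fields are present; it does not replace the repository's own validation or any discipline-specific schema.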
2. Documentation
Documentation, like metadata, should accompany data to ensure that the data is easily discoverable, understandable and re-usable by other researchers without the help of the data creators. You can choose to document your data in two ways:
- Description at study or collection-level,
- Description at data-level.
If you have any questions regarding metadata and documentation of data, please contact your Research Data Support Officer or the Digital Curation Team.
3. Digital Object Identifiers (DOIs)
Obtaining a DOI for your data ensures that it can be cited (by you or someone else). It also allows you to track use of your data as others use and cite it, and makes your data uniquely identifiable and therefore easy to retrieve via a web search.
If you opt to store your data with a reliable repository, a DOI for your data will be minted at the point of deposit. See our list of repositories above.
Citing data
- Linking your data to your publications
Many funders expect you to explain in a formal citation how and where supporting data and/or metadata can be accessed (the mention "contact author" is not sufficient).
See the guidance on writing a formal citation and the examples of Data Access Statements (for restricted and non-restricted data) to include in your publications.
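To make this concrete, the sketch below assembles a data citation from the same metadata fields used to describe a dataset. The layout "Creator (Year): Title. Publisher. Identifier" is assumed here as one common pattern; the `format_data_citation` helper, the example values and the DOI are hypothetical, and the style required by your funder or publisher takes precedence.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Build a citation string of the form
    'Creator (Year): Title. Publisher. https://doi.org/DOI'.

    The layout is an assumed, commonly used pattern, not a mandated style.
    """
    creator_text = "; ".join(creators)
    return f"{creator_text} ({year}): {title}. {publisher}. https://doi.org/{doi}"


# Hypothetical dataset details, for illustration only.
citation = format_data_citation(
    creators=["Doe, Jane", "Smith, Alex"],
    year=2020,
    title="Example survey dataset",
    publisher="Example Repository",
    doi="10.1234/example-doi",
)
print(citation)
```

Expressing the DOI as a full `https://doi.org/` link lets readers resolve the dataset directly from the citation.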
- Citing data that you are re-using
Check our guide on how to re-use data and how to choose a reference style.