Library Services

Copyright for research data and software

Guidance on copyright related to research data, software and code, to support Open Science practices.

UCL's Research Data Policy defines research data as “facts, observations or experiences on which an argument or theory is constructed or tested”. Instances of research data include, but are not limited to: laboratory notebooks; field notebooks; questionnaire and test responses; texts; audio files; video files; models; photographs.

Rights to research data, software and code

Are research data protected by copyright?

In many cases, yes. While facts and raw data are not protected by copyright, copyright law does protect any works constituting research data which can be deemed to be ‘literary, artistic, dramatic or musical works, sound recordings, films or broadcasts or typographical arrangements of published editions’. In practice, this means that your lab notebooks, protocols, spreadsheets, interview transcripts, test responses, images and recordings are covered by copyright; and so are computer programs and databases as these, too, are considered ‘literary’ works.

Databases

Databases, defined as “collections of independent works, data or other materials arranged in a systematic or methodical way and individually accessible by electronic or other means” (Database Directive; Copyright and Rights in Databases Regulations 1997), may be protected both by copyright and database rights.

Copyright may protect (a) the structure of the database if some intellectual input has gone into selecting and/or organising its contents; (b) the individual components of a database, e.g. if it is a collection of texts or images. See Section 3A of Copyright, Designs and Patents Act 1988.

The database right is a separate form of intellectual property that applies to databases if “there has been substantial investment in obtaining, verifying or presenting the contents”. “Investment” may include financial, human or technical resources. See Part III of The Copyright and Rights in Databases Regulations 1997.

Are software and code protected by copyright?

Software and code may be protected by copyright. The Copyright, Designs and Patents Act 1988 (CDPA) states that computer programs and the preparatory design material for a computer program (e.g., flow charts and technical specifications) are protected as literary works. While the CDPA does not define the term ‘computer program’, it is understood to include the source code.

Software programs are often more complex works that include several artistic and functional elements. Software may be protected by both copyright and database rights. In some special cases, software may also be protected as a patentable invention in the UK, if certain criteria are met (see the Manual of Patent Practice on the GOV.UK website). Additional legal protection may apply via confidentiality agreements and licence agreements.

Who owns the copyright to my research data, software and code?

According to UCL’s Intellectual Property policy and related guidance, UCL holds the copyright (and where applicable, database rights) to the following, if they were discovered or created by UCL staff in the course of duties:

  • Materials. Any materials, models, data, prototypes, compounds, samples or physical items or objects;
  • Databases;
  • Technical specifications, technical designs or other works which may help support or protect commercialisation in patentable inventions, trade secrets, technical know-how, commercially exploitable products or other innovations;
  • Computer programs, including but not limited to software, source code, object code, and preparatory design materials.

In practice, as part of your research activities and in line with UCL’s Research Data Policy, most of the time you will make decisions about where, when and how you will license and share your research data to make it as open and re-usable as possible. Likewise, in most cases you should be able to decide how you will share and license software that you develop.

By default, UCL students own any IP they generate. However, there may be some complexity around defining ownership if, for example, the work is sponsored or produced in collaboration with staff. For full details, please see UCL’s Intellectual Property policy.

Why is it important to clarify rights related to research data?

When entering a research contract, for example with a sponsor or external collaborator, it is important to understand and, where possible, negotiate rights that determine whether, when and how the outcomes of the research can be exploited, shared and reused. This is in line with UCL’s research data policy, which states that: “Researchers should (…) establish and document agreements for research data management when involved in a joint research project, collaborative research or research undertaken in accordance with a contractual agreement”.

This is crucial for several reasons, some of which are:

  • to protect and exploit commercially valuable IP
  • to determine the terms of transferring materials and data between organisations
  • to ensure that you can share your data as openly as possible
  • to meet various policy requirements, including open science and legal requirements that may differ across countries, sectors and institutions.

Ownership, storage and sharing of research data, software and other outputs produced through the research cycle can be addressed in different types of contracts, including collaboration agreements, consortium agreements, confidentiality/non-disclosure agreements, materials transfer agreements and data access/sharing agreements. Please contact your Research Data Support Officer for templates you can use around transfer of rights and licensing. Where applicable, you are also advised to follow the policy and guidance on materials/data transfer.

Licensing research data, software and code: what to consider 

Why is it important to license data, software and code?

Research datasets and databases are normally protected by IP legislation. This means that you need to work within the current legal IP frameworks to specify in what ways others can, not only access, but also reuse your data (e.g., distribute, use as secondary data). Licences allow you to specify terms of reuse, in line with the FAIR data principles.

Likewise, it is essential to specify whether and how others can reuse your software and code, to enable others to reuse it as you intended. Licences for software and code address terms of accessing, copying, modifying and redistribution of the original.

Crucially, open licences also support open science practices. Applying an open licence or waiver to your data, software, code and other outputs, such as hardware, supports openness, collaboration and innovation. See related guidance on the software sustainability page.

What do I need to consider when licensing data, software and code?

This checklist will help you make a decision around licensing. Information around these points should be captured in the metadata describing your work and/or in the licence deed.

Ownership and provenance

  • Who owns the rights to the data?
  • What is the provenance of the data? Where were the data (original and subsequent versions) created, stored and processed?

Legal, contractual and policy requirements

  • Are there any legal or contractual requirements that the data be restricted, permanently or temporarily? If ethical, commercial and legal factors require data to be available on request only, you will need to provide an end user agreement specifying under what terms the data can be reused.
  • Are the data the outcome of a funded project? Does the funder’s policy require a specific waiver or licence to be applied?

Terms of reuse

  • Can the data, software and code be accessed publicly? If they can only be available on request, what are the terms of granting access and reuse?
  • Will you make your software/code open source?
  • Would you require attribution?
  • If the data can be openly licensed, does the licence allow modifications to the original (derivatives)?
  • If derivatives are permitted, do you require that they are shared under the same licence terms?
  • Would you allow commercial reuse?  

Licence options for research data

What is UCL’s position on licensing research data?

UCL’s policy is that “following primary use (e.g. publication) or when research data is archived for long term preservation, these data will be made available in the most open manner appropriate”.

In particular: “unless covered by third party contractual agreements, legislative obligations or provisions regarding ownership, UCL research data will be provided using a Creative Commons CC0 waiver; supported by data citation guidelines similar to existing publishing conventions.  This will ensure that re-used data are unambiguously identifiable and that appropriate credit and attribution is made”.

What is a CC0 waiver and why is it considered a good option for research data?

Applying a Creative Commons CC0 waiver to your research data means that you waive all copyright and database rights and release the work to the public domain. This effectively allows others to access, copy, modify and distribute it, without a legal obligation to cite the source. Attributing the source, however, would still be expected as a matter of research integrity. When applying the CC0 waiver, you can request to be attributed.

Applying the CC0 waiver has several advantages. The waiver:

  • Removes uncertainty on whether factual data and databases should be protected in the first instance.
  • Supports open science and open education by allowing reuse of the data in various contexts. For example, the datasets can be used in teaching, or used as secondary data to produce new research.
  • Does not affect patent rights.
  • Offers a process enabling you to request to be cited.

See more information on the Creative Commons wiki.

What licence options can be applied to research data?

Creative Commons licences help you determine how others would share, adapt and reuse your data around three main criteria:

  • Can others modify (i.e., create a new dataset based on the original)?
  • Can others reuse the dataset commercially?
  • If others share the dataset or a derivative, would you ask them to share it under the same licence (or a licence with the same reuse terms)?

All Creative Commons licences require attribution: others reusing the dataset should credit the creators and the source.

The six Creative Commons licences are built around these criteria; for example, CC BY-ND does not allow derivatives, CC BY-NC does not allow commercial reuse, and CC BY-NC-SA allows non-commercial reuse only, as long as derivatives are made available under the same licence. Not all the licences support open and reproducible research, as the ND and NC terms impose restrictions. A useful tool for considering which licence to apply is the UFAL GitHub licence selector.
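The way the three criteria combine into the six licences can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are hypothetical, and the Creative Commons Licence Chooser remains the authoritative route to a licence.

```python
def choose_cc_licence(allow_derivatives: bool,
                      allow_commercial: bool,
                      share_alike: bool) -> str:
    """Map the three reuse criteria to one of the six Creative Commons licences.

    Hypothetical helper for illustration; not an official Creative Commons tool.
    """
    if share_alike and not allow_derivatives:
        # ShareAlike governs how derivatives are shared, so it has no
        # meaning when derivatives are not allowed at all.
        raise ValueError("ShareAlike only applies when derivatives are allowed")

    parts = ["BY"]  # every Creative Commons licence requires attribution
    if not allow_commercial:
        parts.append("NC")
    if not allow_derivatives:
        parts.append("ND")
    elif share_alike:
        parts.append("SA")
    return "CC " + "-".join(parts)
```

For example, `choose_cc_licence(True, False, True)` returns `"CC BY-NC-SA"`, while allowing both derivatives and commercial reuse without a ShareAlike condition returns `"CC BY"`.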

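Open Data Commons also offers three licences specifically designed for data: the Public Domain Dedication and License (PDDL), the Attribution License (ODC-By) and the Open Database License (ODbL).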

Which licences (besides the CC0 waiver) ensure data are open and reusable?

As set out by the Open Definition:

Open data is data that can be freely used, re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.

If it is not appropriate to waive rights to your research data, Creative Commons licences that meet the Open Definition are the Creative Commons Attribution licence (CC BY) and the Creative Commons Attribution ShareAlike licence (CC BY-SA). These two licences support:

  • reuse and redistribution
  • universal participation, meaning that no restrictions are imposed on specific groups of people or on certain purposes, including commercial purposes
  • interoperability: intermixing with other datasets to create larger systems.

Open Data Commons licences also support open data. Creative Commons licences were originally designed to license content (e.g. images, test responses) rather than databases (i.e. the structure of a database), but overall they are perceived as more flexible than ODC licences. See a brief comparison on the Creative Commons wiki.

How do I apply a Creative Commons licence (or a waiver) to my data?

You don’t need to register a Creative Commons licence: all you have to do is share the terms of the licence alongside the work you are licensing. The Creative Commons Licence Chooser tool helps you choose a licence, generate standard text specifying the terms of the licence, embed the licence on your website if necessary, and determine how you want to be attributed. Similarly, the CC0 tool helps you generate standard text that you can apply to your work if you wish to release it to the public domain.

To apply the licence to your research data, paste the licence text into your metadata file (this can be a readme file shared along with your dataset). You may also want to add the licence text to the dataset file itself.
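As a minimal sketch of this step, the Python below writes a readme file that records the licence next to a dataset. The field names, the `README.txt` filename and both function names are assumptions made for illustration; the actual licence wording should come from the Licence Chooser or CC0 tool.

```python
from pathlib import Path

# Public URL of the CC0 1.0 Universal dedication.
CC0_URL = "https://creativecommons.org/publicdomain/zero/1.0/"


def licence_statement(title: str, creators: list[str],
                      licence_name: str, licence_url: str) -> str:
    """Build a short plain-text metadata block (hypothetical field names)."""
    lines = [
        f"Title: {title}",
        f"Creators: {', '.join(creators)}",
        f"Licence: {licence_name}",
        f"Licence URL: {licence_url}",
    ]
    return "\n".join(lines) + "\n"


def write_readme(dataset_dir: Path, statement: str) -> Path:
    """Write the statement to README.txt inside the dataset folder."""
    dataset_dir.mkdir(parents=True, exist_ok=True)
    readme = dataset_dir / "README.txt"
    readme.write_text(statement, encoding="utf-8")
    return readme
```

A call such as `write_readme(Path("my_dataset"), licence_statement("Survey responses", ["A. Researcher"], "CC0 1.0 Universal", CC0_URL))` would place the statement alongside the data files, where the dataset name and creator are placeholders.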

If you are depositing the data in the UCL research data repository, add the licence, preferably with a link to the licence website, in the ‘licence’ metadata field.

Licence options for software and code

What is UCL’s position on licensing software and code?

UCL’s statement on transparency in research expects researchers, as far as is possible and appropriate, to “make their research methods, software, outputs and data open, and available at the earliest possible point”. Where software is concerned, researchers are encouraged to deposit their software in suitable repositories such as GitHub and Zenodo.

In addition to making software openly available, you are supporting open science if you share the source code in ways that allow others to inspect, modify, and enhance it (open source software).

There are many open source software licences. What do I need to consider before I choose one?

Guidance around software licences is not prescriptive, and ultimately it is up to you to choose a licence that meets your purposes. When choosing a licence, we suggest that you consider the following:

1. Know the principles of open source.

‘Open source’ extends beyond simply making the source code available for free. In addition to both the software and the source code being licensed for free redistribution, open source software must also follow a set of principles that allow modification and derived works; remove barriers to access and reuse of future versions; ensure that reuse is not confined to specific groups of users or to specific purposes; and ensure that the terms of the licence are not tied to specific distributions of products or to technologies. If you are considering making your software open source, please read the full criteria of the Open Source Definition.

A number of open source licences have been approved by the Open Source Initiative as meeting the criteria of the Open Source Definition.

2. Think of factors that may affect your choice.

  • Ownership. Consider terms of ownership and sharing; for example, if you have co-written the code, if your product is the result of a sponsored project, or if there is any other agreement in place preventing you from licensing and sharing your software.
  • Commercial potential. If your software has commercial potential, you are advised to discuss this with UCL's Innovation and Enterprise services. In some cases, you may still be able to make your software open source, as this does not preclude commercialisation.
  • Patents. If your software is protected by a patent, you may want to choose a licence that explicitly addresses this by granting or denying patent rights to users.
  • Dependencies. If your software has dependencies, i.e. uses libraries, code or existing software, you need to take into account how these are licensed, and whether and how you can apply a licence compatible with existing ones, or indeed whether you can apply an open source licence at all. Compatibility between licences of different versions or components ensures that conditions from different licences can be met. You can learn more about compatibility on the EUPL website.

3. Consider whether you would allow proprietary ('closed source') versions.

Your choice of licence is very much determined by how you would like others to distribute adaptations.

If you would like to allow others to use the code in developing proprietary products, then you can use a permissive licence. This means that, while your original product is open source, the licence allows others to create modifications that are released under more restrictive terms (closed source).

If it is important to you that future modifications of your product remain open source, you may prefer to apply a copyleft licence. If someone develops software using some of your code, you may require that their whole software be distributed under your open source licence ('strong' copyleft terms). Alternatively, you may only require that certain components of the new product that are based on your code are open source, while other parts may be proprietary ('weak' copyleft terms). Specifying which components should be under your licence and which can be proprietary can be done at the module level, file level or library level.

Copyleft licences applied by others are important when your own software has dependencies. It is therefore important that you are aware of such terms if you are using components created by others.
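A first-pass check of dependencies along these lines might look like the sketch below. The licence labels, the grouping table and the function name are assumptions for illustration, and the result is only a list of dependencies worth a closer look; it is not a legal compatibility assessment.

```python
# Illustrative grouping of some widely used licences (labels are hypothetical
# short names chosen for this sketch).
LICENCE_FAMILY = {
    "MIT": "permissive",
    "Apache-2.0": "permissive",
    "BSD-3-Clause": "permissive",
    "LGPL-3.0": "weak copyleft",
    "MPL-2.0": "weak copyleft",
    "GPL-3.0": "strong copyleft",
    "AGPL-3.0": "strong copyleft",
}


def dependencies_to_review(intended_licence: str,
                           dependencies: dict[str, str]) -> list[str]:
    """Return names of dependencies whose licence needs a closer look.

    A dependency is flagged when its licence is not in the table above, or
    when it is strong copyleft and the intended licence is not.
    """
    intended_family = LICENCE_FAMILY.get(intended_licence)
    flagged = []
    for name, licence in dependencies.items():
        family = LICENCE_FAMILY.get(licence)
        if family is None:
            flagged.append(name)  # unknown licence: check manually
        elif family == "strong copyleft" and intended_family != "strong copyleft":
            flagged.append(name)  # copyleft terms may extend to your code
    return flagged
```

Anything this check flags should then be examined properly, for example with a dedicated compatibility checker, before a licence is chosen.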

4. Consider whether you would require attribution.

Some licences have terms requiring users to name you when distributing the product or its modifications.

All free or open source software licences specify that anyone who distributes or adapts the software must give credit to the original authors of the software somewhere in their distribution. Some free or open source software licences go further than this, and specify that the credit must take a particular form and appear in specific instances, for example on the software’s user interface every time it is run. This is sometimes called ‘enhanced attribution’.

Which licence should I choose?

There is no single licence that will suit every scenario. You should therefore consider the main criteria (see There are many open source software licences. What do I need to consider before I choose one?, above) - particularly whether your project has dependencies e.g. you are using someone else's licensed libraries, and whether you would like to allow proprietary versions or not. A good place to start is to look at licences commonly used in your community.

The UFAL GitHub licence selector is a wizard-style tool guiding you through licence options for your software, and taking into account compatibility issues.

The choose a licence resource is also a useful guide. If you are using this tool and your software has dependencies, you may want to check compatibility with the licence you want to choose. A useful tool is the EUPL compatibility checker.

Below is a list of the most commonly used open source licences, with information on how they meet the key criteria.

  • The MIT Licence is a simple permissive licence, allowing proprietary reuse, without a requirement to state whether changes were made or to distribute the source code. The licence requires a copyright/licence notice. Other permissive licences include the Apache 2.0 licence, which requires modified files to carry a notice stating that changes were made and also includes permission for patent reuse; and the 3-clause-BSD licence.
  • The GNU General Public Licence v3.0 is a strong copyleft licence, allowing modifications while requiring that future versions are distributed with the same licence. There is a requirement to indicate any modifications made, and to share the source code. There is also explicit permission for patent reuse. Another popular strong copyleft licence is the GNU Affero General Public Licence v3.0.
  • The GNU Lesser General Public Licence v3.0 is a 'weak' copyleft licence: modifications of complete works distributed as a whole must be distributed under the same or a compatible licence (GNU GPLv3). However, if a work under this licence is used as a component of a larger work, it is not a requirement to license the larger work with the same licence. Similar terms apply in the Mozilla Public Licence 2.0 (MPL2.0).

Can I apply a Creative Commons licence to my software?

Creative Commons licences support openness and reuse of many creative works, including research publications, teaching materials and datasets. However, they are not designed for software and code, and Creative Commons itself recommends against using them for this purpose.

Open source licences have been developed specifically for software and code. GitHub provides a useful resource guiding you through choosing a suitable open source licence. Points to consider when making your software open source include whether the licence suits the needs of your community, how permissive you would like the licence to be, and whether you would require others to share their adapted versions under the same terms of reuse.

How do I apply a licence to my software/code?

Applying a licence typically involves creating a text file that includes the terms of the licence and including it in the source code. Naming conventions for this file vary by licence. Additional guidance per licence can be found on every licence page on the choose a licence resource.
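A minimal sketch of this step in Python is given below. The list of common filenames and both function names are assumptions for illustration; the licence text itself should be copied unchanged from the licence's own page.

```python
from pathlib import Path
from typing import Optional

# Filenames often used for licence files (an illustrative, non-exhaustive list).
COMMON_NAMES = ("LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING")


def find_licence_file(repo_dir: Path) -> Optional[Path]:
    """Return the first existing licence file in the folder, or None."""
    for name in COMMON_NAMES:
        candidate = repo_dir / name
        if candidate.is_file():
            return candidate
    return None


def add_licence_file(repo_dir: Path, licence_text: str,
                     filename: str = "LICENSE") -> Path:
    """Write the licence text to the top level of the source folder.

    Refuses to proceed if a licence file is already present, so that an
    existing licence is never silently replaced or contradicted.
    """
    existing = find_licence_file(repo_dir)
    if existing is not None:
        raise FileExistsError(f"{existing.name} already exists in {repo_dir}")
    target = repo_dir / filename
    target.write_text(licence_text, encoding="utf-8")
    return target
```

The `filename` parameter is left configurable because, as noted above, the expected name of the file differs between licences.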

Where can I learn more about open source software and code?

Here are some useful resources: