UCL Institute of Health Informatics


UCL-THIN Long Covid Study

University College London (UCL), in collaboration with The Health Improvement Network (THIN), a Cegedim company, is carrying out a study on Long Covid.

The study will use information in GP patient records to improve our understanding of Long Covid, so that patients can receive an earlier diagnosis and better care.

The study is being carried out using the THIN research database, which is already used for many medical research studies to help improve patient care.

What is the study trying to find out?

The study will investigate the full range of physical and mental health symptoms of COVID-19, by comparing people who have had COVID-19 with those who have not. Researchers will analyse information about people's symptoms in the clinical notes typed by the GP (‘free text’), as well as diagnoses, medication and test results.

What is ‘free text’?

Some of the information in health records is in the form of numbers or codes, such as diagnosis codes or laboratory measurements, and is called ‘structured’ or ‘coded’ data. Structured data does not include words and sentences typed by the doctor in the clinical notes, which often contain information about patient symptoms. These clinical notes are called ‘free text’.

This study will use both types of information (structured information and free text) to improve our understanding of Long Covid.

[Image: a computer screen showing clinical codes and other medical data matched to clinical notes about Covid.]

How are patient data used for the study kept safe and private?

Patient data for this research study will be anonymised and stored securely at all times, and will be accessible only to a limited number of authorised staff who are conducting the research. This study has been approved by an NHS Health Research Authority Ethics Committee.

Structured information in health records is extracted automatically for research in an anonymised way, without including personal identifiers such as name or address. Free text notes in GP records are also extracted to THIN for this study, and computer programs are then used to convert the free text into anonymised structured data. Under strict conditions, approved researchers may view samples of the free text within a secure computing environment to ensure that the extracted information is accurate.

Are other research studies being done on Long Covid?

The UCL-THIN study is one of a number of research studies taking place to find out more about Long Covid. Each study has strengths and weaknesses, which is why it is necessary to carry out different types of studies to find out different pieces of information about Long Covid.

Some studies involve patients attending specialist clinics. This gives detailed insights into Long Covid but is limited to patients who attend a specialist clinic.

Some studies involve patients reporting their own symptoms. These give a detailed understanding of people’s experiences and are not limited to the symptoms reported to doctors. However, they do not have a comparison group, so it is difficult to be sure that the symptoms are due to Long Covid.

This UCL-THIN study will provide another piece of the jigsaw by comparing GP patient records between people with and without a history of COVID-19. It will be particularly relevant to help patients attending their GPs with symptoms that may be due to Long Covid.

Who is funding the study?

The study is funded by the National Institute for Health Research, as part of a grant to UCL for the project “Characterisation, determinants, mechanisms and consequences of the long-term effects of COVID-19: providing the evidence base for health care services”.

Who are the study team?

The study is led by Dr Anoop Shah at the UCL Institute of Health Informatics. The study team is a collaboration of researchers from UCL and other universities, clinicians, and patient and public members.