THIN Database


The Health Improvement Network (THIN) database is a collaboration between two companies: In Practice Systems (INPS), which developed the Vision software used by general practitioners (GPs) in the UK to manage patient data, and IMS Health, which provides access to the data for use in medical research. THIN data are collected during routine practice and regularly delivered to THIN. Since THIN data collection began in 2003, over 500 Vision practices have joined the scheme.

Research studies for publication conducted using THIN data are approved by a nationally accredited ethics committee which has also approved the data collection scheme.

The UCL Research Departments of Primary Care & Population Health (PCPH) and Infection & Public Health (IPH) have acquired a full licence to THIN for the purposes of conducting large-scale epidemiological, clinical and health care utilisation studies.


THIN data currently contain the electronic medical records of 11.1 million patients (3.7 million active patients), equivalent to 75.6 million patient years of data collected from 562 general practices in the UK and covering 6.2% of the UK population. All data are fully anonymised, processed and validated by CSD Medical Research UK.

Data structure

The patient data are arranged in four standardised files and one linked file per practice (Table 1). Diagnoses are coded using hierarchical Read codes, which allow some standardisation of the way that information is recorded. The codes are grouped into themed “chapters” (e.g. cancer) and include terms relating to symptoms, diagnoses, procedures, and laboratory tests. Prescriptions are currently entered using Multilex codes issued by First Databank, which can be easily linked to British National Formulary (BNF) codes. Most practices now have laboratory data automatically downloaded into the electronic medical record.
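Because Read codes are hierarchical, all codes under a chapter or sub-chapter can be selected by matching on the leading characters of the code. The following is a minimal sketch only: it assumes codes are stored as fixed-width strings padded with trailing dots, and the patient identifiers and code values are hypothetical placeholders rather than a validated code list.

```python
def in_read_hierarchy(code: str, parent: str) -> bool:
    """Return True if `code` sits at or below `parent` in the hierarchy.

    Assumes dot-padded codes, so the parent's non-dot characters
    form the prefix shared by all of its descendants.
    """
    stem = parent.rstrip(".")
    return code.startswith(stem)


# Hypothetical (patient_id, code) pairs, for illustration only.
records = [
    ("p001", "B34.."),
    ("p002", "H33.."),
    ("p003", "B10.."),
]

# Select every patient with a code anywhere under the "B" chapter.
chapter_b = [pid for pid, code in records if in_read_hierarchy(code, "B....")]
print(chapter_b)
```

In practice a study would normally work from a reviewed code list rather than a bare prefix, since a chapter can contain terms that are not relevant to the condition of interest.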

It is also possible to identify people living at the same address, and to identify mothers and children living at different addresses on the basis of date of birth and delivery codes.

Table 1: Main file types of THIN data

PATIENT: age, sex, registration date when entering the practice, and date when leaving the practice
MEDICAL: medical diagnoses, date of diagnosis, location of the event (e.g., GP’s office, hospital, consultant), and an option for adding free text; referrals to hospitals and specialists
THERAPY: all prescriptions along with the date issued, formulation, strength, quantity, and dosing instructions; indication for treatment for all new prescriptions (inferred from cross reference to medical events on the same date); and events leading to withdrawal of a drug or treatment
ADDITIONAL HEALTH DATA (AHD): vaccinations and prescription contraceptives; miscellaneous information such as smoking, height, weight, immunisations, pregnancy, birth, death, and laboratory results
POSTCODE VARIABLE INDICATORS (PVI): postcode-linked, area-based socio-economic, ethnicity and environmental indices

date, time and duration of consultation
gender and roles of staff who entered the data
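The registration and leaving dates in the PATIENT file are what make it possible to express follow-up in patient years. As a rough sketch, and assuming hypothetical field names (`patient_id`, `registered`, `left`) and a hypothetical study end date that are not part of the THIN specification, follow-up per patient could be computed like this:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientRow:
    # Field names are illustrative, not the real THIN column names.
    patient_id: str
    registered: date
    left: Optional[date]  # None means the patient is still registered


def patient_years(row: PatientRow, study_end: date) -> float:
    """Follow-up in years from registration to leaving or study end."""
    end = min(row.left, study_end) if row.left is not None else study_end
    days = (end - row.registered).days
    return max(days, 0) / 365.25


study_end = date(2015, 1, 1)
rows = [
    PatientRow("p001", date(2010, 1, 1), date(2012, 1, 1)),
    PatientRow("p002", date(2014, 1, 1), None),
]
total = sum(patient_years(r, study_end) for r in rows)
print(round(total, 2))
```

Dividing by 365.25 is a common convention for converting days to years; a real analysis would also need to decide how to handle the date from which a practice's data are considered reliable.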

Each entry in a patient’s medical file can have a comment associated with it, as the Vision system allows the entry of free text or scanned information. So far 158,037 comments, including the 10,000 most frequently used in the medical records, have been anonymised, which equates to 35% of all comments in the medical records. A seven-character numeric identifier has been linked to each unique comment, and the text-id field in the medical records has been updated with this identifier. An additional ancillary file, called THINComments, is now available containing the identifiers and the corresponding free-text strings.
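The link between the medical records and THINComments is a simple lookup on the comment identifier. The sketch below illustrates the idea with plain dictionaries; the key names (`text_id`, `comment`), the identifiers and the comment strings are all hypothetical, and the real file layout may differ. Identifiers are kept as strings so that any leading zeros in the seven-character value are preserved.

```python
def attach_comments(medical_rows, comments):
    """Add the anonymised comment text to each medical row, if available.

    Rows whose identifier is absent from the lookup get None, which is
    expected because only a proportion of comments have been anonymised.
    """
    joined = []
    for row in medical_rows:
        text = comments.get(row.get("text_id"))
        joined.append({**row, "comment": text})
    return joined


# Hypothetical lookup built from the ancillary comments file.
comments = {"0000123": "patient advised to return in 2 weeks"}

# Hypothetical medical rows; the second refers to a comment not yet anonymised.
medical_rows = [
    {"patient_id": "p001", "text_id": "0000123"},
    {"patient_id": "p002", "text_id": "0004567"},
]

joined = attach_comments(medical_rows, comments)
print(joined[0]["comment"], joined[1]["comment"])
```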

It may be possible to obtain further patient information via the Additional Information Service (AIS) including:

• anonymised questionnaires completed by the patient or GP
• copies of patient-based correspondence
• a specified intervention (e.g. a laboratory test to confirm diagnosis)
• death certificates

The cost of using the AIS varies with the type and quantity of information requested, and any use of the service requires full MREC approval.

Some of the strengths and limitations of THIN data.

Page last modified on 14 apr 15 12:23 by Rebecca K Lodwick