November 2012: Publication
13 December 2012
Databases of electronic health records are a valuable source of information for medical research, but much of the information is recorded as free text rather than in a structured form. Free text has to be manually anonymised to remove anything that might identify a patient before it can be used in research, a time-consuming process that is not feasible on a large scale. We developed a computer program, the Freetext Matching Algorithm, to extract diagnoses and other information from free text in general practice records, making this information easier to use in research studies. We tested it on samples of text from the General Practice Research Database, and we plan to use it on a larger scale to facilitate research into heart disease.