Curation Process

High-quality human gene annotations are generated through a combination of computational and manual techniques (Barrell et al., 2009; Dimmer et al., 2012), both of which require a team of skilled biologists and software engineers.

Manual gene annotation involves the extraction of information from published scientific papers (Balakrishnan et al., 2013; Orchard et al., 2014). Every GO or PPI annotation is attributed to an identified reference by means of a publication identifier, and each annotation must indicate what kind of evidence supports the association between the gene product and the GO term, or the protein-protein interaction.
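As a rough illustration of the rule that every annotation carries both a reference and an evidence statement, the record below is a minimal sketch in Python. The class name, field names and all identifier values are hypothetical placeholders chosen for this example; they are not the schema of any annotation database. The only real-world value used is "IDA", the GO evidence code for "Inferred from Direct Assay".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record (not an official schema)."""

    gene_product: str  # identifier of the annotated gene product
    term: str          # GO term identifier, or the interacting partner for a PPI
    reference: str     # publication identifier the annotation is attributed to
    evidence: str      # kind of evidence supporting the association

    def __post_init__(self):
        # An annotation without a reference or without evidence is rejected,
        # mirroring the requirement described in the text above.
        if not self.reference.strip():
            raise ValueError("annotation must cite a publication identifier")
        if not self.evidence.strip():
            raise ValueError("annotation must state its supporting evidence")


# Placeholder identifiers only; none of these refer to real records.
example = Annotation(
    gene_product="UniProtKB:P00000",
    term="GO:0000000",
    reference="PMID:0000000",
    evidence="IDA",
)
print(example)
```

Making the checks part of the record itself means an unattributed or unsupported annotation cannot be constructed at all in this sketch, rather than being caught later.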

The large-scale assignment of GO terms to human gene products using computational methods is a fast and efficient way of associating high-level terms with a large number of genes. However, to provide more reliable and specific annotations, GO curators use information from the published scientific literature to ‘manually’ associate highly descriptive GO terms with gene products. Similarly, PPI data are captured both from high-throughput datasets, such as yeast two-hybrid experiments, and from small-scale experimental data.

Consequently, complete and highly detailed annotation of the processes and networks in which a single gene product is involved may take considerable time. How long depends on the number of published papers describing the gene product, with 3-4 experimental papers annotated a day.
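To give a feel for the scale, the sketch below turns the rate of 3-4 papers a day into a range of working days. The function name and the figure of 60 papers are assumptions made purely for illustration, and the estimate ignores time spent finding and triaging papers.

```python
import math


def estimate_curation_days(n_papers, papers_per_day):
    """Working days needed to annotate n_papers at a fixed daily rate."""
    if n_papers < 0 or papers_per_day <= 0:
        raise ValueError("need n_papers >= 0 and papers_per_day > 0")
    # Round up: a partly used day still counts as a day of work.
    return math.ceil(n_papers / papers_per_day)


# Hypothetical gene product described by 60 experimental papers.
n_papers = 60
fastest = estimate_curation_days(n_papers, 4)  # at 4 papers a day
slowest = estimate_curation_days(n_papers, 3)  # at 3 papers a day
print(f"{n_papers} papers: about {fastest}-{slowest} working days")
```

Under these assumptions a gene product with 60 relevant papers would take roughly 15 to 20 working days to annotate in full.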

Page last modified on 12 mar 14 20:18

The work of the Cardiovascular Gene Annotation group is supported by British Heart Foundation grant RG/13/5/30112. The work of the Neurological Gene Annotation group is supported by Parkinson's UK grant G-1307.