MHC Haplotype Project

The MHC Haplotype Project was conducted between 2000 and 2006 at the Sanger Institute and offers a framework and resource for association studies of all MHC-linked diseases. It provides the genomic sequences and gene annotation of eight different HLA-homozygous typing haplotypes (listed below), their resulting variations (see Data below) and ancestral relationships.

The table below lists the eight cell lines used in the project along with their HLA haplotypes and alleles. Links are given to the Sanger Institute Vega database from which sequences and gene annotation can be downloaded, and to the same regions in the UCSC browser for those who prefer that approach.

The sequence from PGF is now incorporated in the reference sequence for chromosome 6. For the other seven haplotype sequences, links are given to GRC entries (accession numbers with GL prefixes). These GRC haplotype contigs, called "alternate loci", are constructed so that they begin with additional anchor sequence derived from the reference. The FASTA sequence obtained from these links will therefore differ from that obtained from Vega.

Cell line  Haplotype    HLA-A        HLA-B        HLA-C        HLA-DRB1     HLA-DQB1  HLA-DPB1  Links       GRC
PGF        A3-B7-DR15   03:01:01:01  07:02:01     07:02:01:03  15:01:01:01  06:02:01  04:01     Vega, UCSC  -
COX        A1-B8-DR3    01:01:01:01  08:01:01     07:01:01     03:01:01:01  02:01     03:01     Vega, UCSC  GL000251.1
APD        A1-B60-DR13  01:01:01:01  40:01:01     06:02:01:01  13:01:01     06:03:01  04:02     Vega, UCSC  GL000250.1
DBB        A2-B57-DR7   02:01:01:01  57:01:01     06:02        07:01:01     03:03:02  04:01:01  Vega, UCSC  GL000252.1
MANN       A29-B44-DR7  29:02:01     44:03:01     16:01        07:01:01:01  02:02     02:01:02  Vega, UCSC  GL000253.1
SSTO       A32-B44-DR4  32:01:01     44:02:01:01  05:01:01:02  04:03:01     03:05:01  04:01:01  Vega, UCSC  GL000256.1
QBL        A26-B18-DR3  26:01:01     18:01:01     05:01:01:01  03:01:01:02  02:01:01  02:02     Vega, UCSC  GL000255.1
MCF        A2-B62-DR4   02:01        15:01:01:01  03:04:01:01  04:01        03:01     04:02     Vega, UCSC  GL000254.1

MHC Haplotype Project Data

The data from the MHC Haplotype Project are available as a .txt file for viewing in the UCSC Genome Browser.

View the file

Follow the UCSC instructions for loading BED file data as a custom track. The data are colour-coded by haplotype and will initially be displayed showing 1 Mb in the centre of the MHC. You are then free to adjust the co-ordinates and to zoom in to your region of interest.

When zoomed in, you should change the display of the Sanger_MHC custom track from "dense" to "full". You may also want to unhide "SNPs (131)" under "Variation and Repeats" and "Vega genes" under "Genes and Gene Prediction Tracks".

Although the project data were originally issued in co-ordinates of a previous release of the Human Genome, they have been converted using the UCSC tool LiftOver for use with the February 2009 GRCh37/hg19 Assembly.
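For those who want to inspect the file outside the browser, the following is a minimal Python sketch of reading BED-style lines and keeping the records that overlap a region of interest. It assumes only the standard BED layout (chromosome, 0-based start, exclusive end, optional name, whitespace-separated) with "track" and "browser" header lines; the sample lines, accession AB000000.1 and co-ordinates are hypothetical and are not taken from the project file.

```python
from typing import Iterable, Iterator, List, NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


def read_bed(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield BED records, skipping blank, comment, track and browser lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        name = fields[3] if len(fields) > 3 else ""
        yield BedRecord(fields[0], int(fields[1]), int(fields[2]), name)


def in_region(records: Iterable[BedRecord], chrom: str,
              start: int, end: int) -> List[BedRecord]:
    """Return the records that overlap the half-open interval [start, end)."""
    return [r for r in records
            if r.chrom == chrom and r.start < end and r.end > start]


# Hypothetical sample lines, for illustration only.
sample = [
    "track name=Sanger_MHC",
    "chr6\t31000000\t31000001\tCOX:AB000000.1_12345_AG",
    "chr6\t32500000\t32500003\tQBL:AB000000.1_678_i3ACG",
]
records = list(read_bed(sample))
hits = in_region(records, "chr6", 30900000, 31100000)
```

With a real file, the list of sample lines would be replaced by an open file handle, since read_bed accepts any iterable of lines.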

The convention used for naming variations is:

[haplotype name]:[BAC sequence SV number]_[base position in SV]_[variation]

For single nucleotide polymorphisms, "variation" consists of two letters: first the base in the reference sequence, then the base in the other haplotype. Insertions and deletions are identified by "_i" and "_d" respectively, followed by the numerical value of their length and, if it is 12 bases or fewer, their base sequence. For longer sequences an "X" value is given, which refers to a look-up table.
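The naming convention can be illustrated with a small Python sketch that splits a name into its three documented fields and classifies the variation. This is an assumption-laden reading of the convention, not the project's own parser: it assumes the SV number contains no underscore, and it leaves the text after "i" or "d" (length, sequence or "X" value) unparsed because its exact layout is not specified here. The example names and the accession AB000000.1 are hypothetical.

```python
from typing import NamedTuple


class VariationName(NamedTuple):
    haplotype: str
    sv: str        # BAC sequence SV number, e.g. accession.version
    position: int  # base position within the SV
    kind: str      # "snp", "insertion", "deletion" or "unknown"
    detail: str    # two bases for a SNP, raw remainder for an indel


def parse_variation_name(name: str) -> VariationName:
    """Split [haplotype]:[SV]_[position]_[variation] and classify the variation."""
    haplotype, rest = name.split(":", 1)
    # Assumption: the SV number itself contains no underscore.
    sv, position, variation = rest.split("_", 2)
    # Tolerate a doubled underscore before "i"/"d" in case the marker is "_i"/"_d".
    variation = variation.lstrip("_")
    if len(variation) == 2 and set(variation.upper()) <= set("ACGT"):
        kind, detail = "snp", variation
    elif variation[:1] == "i":
        kind, detail = "insertion", variation[1:]
    elif variation[:1] == "d":
        kind, detail = "deletion", variation[1:]
    else:
        kind, detail = "unknown", variation
    return VariationName(haplotype, sv, int(position), kind, detail)


# Hypothetical names, for illustration only.
snp = parse_variation_name("COX:AB000000.1_12345_AG")
ins = parse_variation_name("QBL:AB000000.1_678_i3ACG")
```

Here snp.detail holds the reference base followed by the base in the other haplotype, while ins.detail keeps the raw text after "i" so that the caller can decide how to interpret the length and sequence.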

Publications

Horton et al 2008 Immunogenetics 60(1):1-18.
Traherne et al 2006 PLoS Genet. 2(1):e9.
Stewart et al 2004 Genome Res. 14(6):1176-87.
Allcock et al 2002 Tissue Antigens 59(6):520-1.
