UCL Division of Biosciences


Understanding genetic ancestry testing

Genetic ancestry testing is the use of DNA information to make inferences about someone's "deep" ancestry, hundreds or thousands of years into the past. Genetic genealogy, on the other hand, combines DNA testing with genealogical and historical records, and typically makes use of large databases to identify matches. There is some overlap between the two, but genetic genealogy is generally more reliable because it draws on this additional information: the information about your ancestry available from DNA alone is limited, as we try to explain here.

There are three main types of genetic ancestry test:

1) A Y-chromosome DNA (Y-DNA) test provides information about your father line, which in most cultures corresponds with the inheritance of surnames. Only males carry a Y-chromosome, but a female can learn about her father line by, for example, having her father or brother tested. The tests currently on offer vary widely in the amount of information they provide. The markers tested are of two types, called STRs (short tandem repeats) and SNPs (single nucleotide polymorphisms), which have different mutation rates and so give information at different time depths. The information you receive depends on how many markers of each type are tested.

SNP testing is used for deep ancestry purposes to provide information about your haplogroup, which tells you which branch of the Y-DNA tree you belong to. Y-STR tests are used for genetic genealogy purposes within surname projects to test hypotheses about relationships and to investigate questions about surname origins. Y-STR tests also provide information about your deep ancestry by giving you a predicted haplogroup assignment.

If two people have the same Y haplogroup, it means that they will share a common paternal ancestor more recently than two people from different haplogroups, but that common ancestor may still have been a long time ago.  That time can be estimated, but the estimate is not precise with current standard tests, although comprehensive sequencing of the Y-chromosome is becoming available and will give improved precision. 

The haplogroup information is often accompanied by a story about the origin of your ancestors, including a map of the world with arrows indicating ancestral migrations with approximate dates. Hundreds of thousands of men from around the world have now had their Y-DNA tested, and we have a very good idea of the distribution of the different haplogroups in the present-day population. It is, however, difficult to be confident about where these haplogroups originated: many different stories could explain the current distribution. Sometimes a company will associate a haplogroup with, for example, Viking, Norman or Saxon ancestry, but such associations are speculative. Just as today most haplogroups are shared among many populations, so Vikings, Normans and Saxons would have been genetically diverse, and different from the modern populations in their countries of origin.

The father line is just one lineage in your family tree, and as you go further back in time it represents less and less of your total ancestry. For example, you have 64 great-great-great-great grandparents, and a Y-DNA test will only tell us about the ancestry of one of these 64 ancestors.
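The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the count is of "slots" in the family tree rather than of distinct people, because the same person can fill more than one slot when distant relatives have children together.

```python
def ancestral_slots(generation: int) -> int:
    """Number of positions in the family tree at a given generation.

    Generation 1 = parents, 2 = grandparents, ...,
    6 = great-great-great-great grandparents.
    """
    return 2 ** generation


def father_line_share(generation: int) -> float:
    """Fraction of the tree at that generation covered by the father line.

    The father line passes through exactly one slot per generation.
    """
    return 1 / ancestral_slots(generation)


for g in range(1, 7):
    print(f"generation {g}: {ancestral_slots(g)} slots, "
          f"father line covers {father_line_share(g):.2%}")
```

The same reasoning applies to the mother line traced by a mitochondrial DNA test.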

2) A mitochondrial DNA (mtDNA) test provides information about the mother line. Mitochondrial DNA is passed on by a mother to her male and female children, but only females can pass their mtDNA on to the next generation. This test, like the Y-DNA test, provides information about one specific lineage: your mother, your mother's mother, your mother's mother's mother, and so on back in time. Again the amount of information provided varies among tests, but the mtDNA sequence is short (just 16,569 DNA "letters") and so sequencing the whole mitochondrial genome is already not very expensive.

An mtDNA test can be used for genealogical purposes to test a hypothesis or to look for matches in a genetic genealogy database. The mtDNA test also provides a haplogroup which may, like the Y haplogroup, be accompanied by a story and perhaps a "migration" map. We know a lot about the present-day distribution of the mtDNA haplogroups, but it is again much more difficult to make inferences about the more distant past. The mtDNA mutation rate is relatively high, but because of the short sequence length the time gap between mtDNA mutations can be 100 generations or more, and so common mtDNA ancestors cannot be dated accurately even with full mitochondrial genome data: if you share a full mtDNA sequence with someone, your common maternal ancestor could have lived 1 or 100 generations ago.
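To see why an exact mtDNA match says so little about dates, here is a rough sketch in Python. It rests on assumptions that are ours, not part of any testing company's method: mutations are treated as arriving at random at a constant rate (a Poisson model), and the rate of one mutation per 100 generations per lineage is simply the illustrative figure from the paragraph above.

```python
import math


def prob_identical(generations_to_ancestor: float,
                   generations_per_mutation: float = 100.0) -> float:
    """Chance that two people still carry identical mtDNA sequences.

    Assumption (ours): mutations follow a Poisson process. Both lineages
    descend from the common ancestor, so there are two branches of
    `generations_to_ancestor` generations on which a mutation could occur.
    """
    expected_mutations = 2 * generations_to_ancestor / generations_per_mutation
    return math.exp(-expected_mutations)


for t in (1, 10, 50, 100):
    print(f"common ancestor {t} generations ago: "
          f"P(identical) = {prob_identical(t):.2f}")
```

Under these assumptions an exact match remains reasonably likely across the whole range from 1 to 100 generations, so the match on its own cannot narrow down when the common maternal ancestor lived.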

3) An autosomal DNA test provides information from the great majority of your DNA (the autosomes are the chromosomes other than the X and Y, and do not include mtDNA). Although full-genome sequencing is not far away, it remains unaffordable for most, and autosomal DNA tests usually examine up to around 1 million genetic markers (SNPs) spread across the genome. These give information about all your ancestors in recent generations, but once you go beyond about 10 generations back into the past (roughly 300 years) only a small fraction of your ancestors have contributed directly to your DNA: so even if William Shakespeare (born ~450 years ago) were your ancestor, you almost certainly inherited no DNA from him. This can be a bit confusing: you did inherit almost all your DNA from ancestors alive at that time, but because there are so many of them (very roughly 30,000 ancestors), you only actually inherited your DNA from a small fraction of them. The unilineal Y and mtDNA are exceptions: you inherited them from all your patrilineal and matrilineal ancestors respectively (the former only if you are male), and so in a sense they can provide a link with very remote ancestors. However, they represent only a small fraction of your genes, provide little information about your ancestors, and allow only limited inferences about time depth.
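The ancestor count for Shakespeare's time can be checked with a short calculation. The sketch below assumes 30 years per generation, which is the figure implied by equating 10 generations with roughly 300 years; the function names are ours, and the result is an upper bound because the same person can appear in more than one place in a family tree.

```python
def generations_back(years_ago: float, years_per_generation: float = 30.0) -> int:
    """Approximate number of generations separating you from a given date.

    Assumption: 30 years per generation, as implied by "10 generations
    (roughly 300 years)".
    """
    return round(years_ago / years_per_generation)


def ancestral_slots(years_ago: float) -> int:
    """Upper bound on the number of ancestors alive about that long ago."""
    return 2 ** generations_back(years_ago)


print(generations_back(450))   # 15 generations back to Shakespeare's birth
print(ancestral_slots(450))    # 32768, i.e. very roughly 30,000
```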

Autosomal DNA tests can be used to identify individuals with whom you share a common ancestor up to a handful of generations in the past.  This is done by looking for large chunks of DNA that you both share, and which must have come from one or more recent common ancestors.  Sometimes it happens that a large chunk of DNA is conserved in two individuals from a common ancestor more than 10 generations in the past, but this is very unlikely: the great majority of common ancestors at that time depth will not be identified from your DNA.  Although sharing one or more large chunks of DNA makes it almost certain that the two of you had at least one recent common ancestor, dating the common ancestor(s) is imprecise, particularly beyond about 4 generations ago, and the tests have no ability to distinguish unbalanced from balanced relationships.  By "balanced" we mean the two individuals have the same number of generations back to the common ancestor. For example, using DNA alone the grandparent-grandchild (unbalanced) relationship cannot be distinguished from the (balanced) half-sibling relationship.  Thus algorithms that predict specific relationships become error-prone beyond about the 2nd cousin level.
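The grandparent and half-sibling example can be made concrete with the standard expected-sharing calculation from pedigree genetics, sketched below in Python. Each shared ancestor contributes (1/2) raised to the total number of generations linking the two people through that ancestor. The function name and the way relationships are encoded are ours, and the sketch assumes the listed ancestors are unrelated to one another.

```python
def expected_shared_fraction(paths):
    """Expected fraction of autosomal DNA shared by two people.

    `paths` lists one (a, b) pair per shared ancestor, where a and b are
    the numbers of generations from each person back to that ancestor
    (0 if the person is the ancestor). Assumes no inbreeding.
    """
    return sum(0.5 ** (a + b) for a, b in paths)


relationships = {
    "grandparent-grandchild (unbalanced)": [(0, 2)],
    "half-siblings (balanced)": [(1, 1)],
    "first cousins (balanced)": [(2, 2), (2, 2)],
    "second cousins (balanced)": [(3, 3), (3, 3)],
}

for name, paths in relationships.items():
    print(f"{name}: {expected_shared_fraction(paths):.4f}")
```

The first two relationships give the same expected value of one quarter, which is why the amount of shared DNA alone cannot tell them apart. The values are averages: the amount actually shared by a given pair varies around them, and this variation makes distant relationships harder to pin down.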

Autosomal tests also provide information about your ethnicity by identifying sections of your DNA that best match reference databases of modern populations with different geographical or ethnic labels. Ethnicity tests are also known as biogeographical ancestry tests or admixture tests. However, the reference populations used for comparison purposes are limited, the ethnic labels applied to them may be questionable, and they were mainly collected by population geneticists to answer questions about differences between populations rather than questions about individual ancestry. It is difficult to distinguish between populations within continents: except for regions of the world experiencing recent inter-continental admixture, such as the Americas, human genetic variation usually changes smoothly with geographical distance. Nevertheless the ethnic/geographical assignments have some validity at a large scale. For example, in Latin Americans it is usually possible to distinguish with confidence sections of an individual's genome that are of sub-Saharan African, European and Native American origin. Testing companies will often assign country labels to genetic clusters, but genetic ancestry does not respect country borders, and this practice can produce incongruous results such as French people being assigned large percentages of "British" ancestry or American people receiving much higher percentages of "British" ancestry than British people. Normandy and Kent are genetically similar, as you might expect from history and geography, so it is not easy to distinguish English from French based on DNA alone, though it should be possible with whole genome sequences and carefully assembled databases. As a result of the random inheritance of DNA, siblings and other close relatives can often receive markedly different ethnicity percentages.

Further reading

The Sense About Science guide "Sense about genetic ancestry testing" highlights the limitations of genetic ancestry testing.

Debbie Kennett's blog post for Sense About Science, "Sense about genealogical DNA testing", provides an overview of the legitimate uses of DNA testing for genetic genealogy purposes.

A list of related articles can be found on the International Society of Genetic Genealogy (ISOGG) website. We particularly recommend the following articles:

  • The Guardian Notes and Theories blog by Mark Thomas.
  • "Selling Roots" by Elliot Aguilar in The New Inquiry.
  • Royal et al. (2010) "white paper" published by the American Society of Human Genetics Ancestry and Ancestry Testing Task Force. These authors say, for example, that:

    • "... moving from [an] inference of common ancestry to the conclusion that the match implies something about the biogeographical ancestry of both individuals can be problematic."
    • "... any quantitative claims about ancestry should have an easily interpreted assessment of confidence or accuracy associated with them".
    • "... whenever formal inferences about population history have been attempted with uniparental systems, the statistical power is generally low. Claims of connections, therefore, between specific uniparental lineages and historical figures or historical migrations of peoples are merely speculative".
  • American Society of Human Genetics Ancestry Testing Statement, 2008
  • Bandelt et al. BioEssays (2008).
  • Bolnick et al. Science (2007).  These authors say:

    • "... when an allele or haplotype is most common in one population, companies often assume it to be diagnostic of that population. This can be problematic ..."
    • "Many genetic ancestry tests also claim to tell consumers where their ancestral lineage originated and the social group to which their ancestors belonged. However, ..."
    • "the tests ... promote the popular [mis]understanding that race is rooted in one's DNA"
    • "market pressures can lead to conflicts of interest".
  • "Beware the gene genies" by Martin Richards in The Guardian 21/2/03.

    • "Lavish but questionable promises have been made to those who want to trace their genetic ancestry". 
