CATH: Protein Structure Classification

The CATH database is a classification of protein structures in the Protein Data Bank. Protein structures are chopped up into domains and grouped together into superfamilies if there is sufficient evidence that they have diverged from a common ancestor during the process of evolution.
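The two steps described above, splitting a structure into domains and then grouping domains by shared ancestry, can be pictured with a small data-structure sketch. This is only an illustration, not CATH's actual data model or software: the `Domain` type, the `group_by_superfamily` helper, the entry codes, the residue ranges, and the superfamily labels are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Domain(NamedTuple):
    """One domain cut out of a PDB chain (hypothetical record layout)."""
    pdb_id: str        # four-character PDB entry code (made-up values below)
    chain: str         # chain identifier within the entry
    segments: tuple    # residue ranges as (start, end) pairs; a domain may be discontiguous
    superfamily: str   # label of the superfamily the domain was assigned to


def group_by_superfamily(domains):
    """Collect domains under the superfamily label they were assigned."""
    groups = defaultdict(list)
    for domain in domains:
        groups[domain.superfamily].append(domain)
    return dict(groups)


# A single chain chopped into two domains, plus a domain from another entry.
# All identifiers and boundaries are invented for illustration.
domains = [
    Domain("1abc", "A", ((1, 120),), "superfamily_X"),
    Domain("1abc", "A", ((121, 250),), "superfamily_Y"),
    Domain("2xyz", "B", ((5, 60), (90, 130)), "superfamily_X"),
]

groups = group_by_superfamily(domains)
for label, members in sorted(groups.items()):
    print(label, [f"{d.pdb_id}{d.chain}" for d in members])
```

Note that the two domains from the same chain end up in different groups: the unit of classification in this sketch is the domain, not the whole protein chain.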

Exploring the Evolution of Protein Domains

Protein domains are annotated, and the evolutionary relationships between them identified, through a combination of automated algorithms and expert manual curation.

The video clip above demonstrates the concept of an "ancient structural core", a central idea in protein structure classification. In the clip, a number of domains from the "HUPs" superfamily are superposed so that they align as closely as possible. Many of the proteins in this superfamily have diverged greatly during evolution; however, the ancient structural core remains conserved, sometimes over many millions of years and across many different organisms.