Gene3D takes CATH domain families, which are derived from PDB structures, and assigns them to the millions of protein sequences that have no PDB structure.
Assigning a CATH superfamily to a region of a protein sequence gives information on its structure and homologous relationships.
Each CATH superfamily has a limited set of functions, so the domain family assignments provide functional insights. Furthermore, most proteins contain multiple
domain families in a specific order, sometimes referred to as the multi-domain architecture (MDA). Identifying proteins with similar domain family organisations can provide further functional insights.
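
As an illustration of the MDA idea, the following minimal Python sketch orders each protein's domain assignments by start position and groups proteins that share the same ordered list of families. The protein names, family labels (SF_A, SF_B), coordinates and function names are all hypothetical placeholders, and this is not a description of how Gene3D itself computes or stores MDAs.

```python
from collections import defaultdict

# Hypothetical domain assignments: (protein_id, family_label, start, end).
# All identifiers and coordinates are invented for illustration.
assignments = [
    ("protA", "SF_A", 10, 180),
    ("protA", "SF_B", 200, 290),
    ("protB", "SF_A", 5, 170),
    ("protB", "SF_B", 190, 280),
    ("protC", "SF_B", 12, 100),
    ("protC", "SF_A", 120, 300),
]

def multi_domain_architectures(assignments):
    """Map each protein to the tuple of its families in N- to C-terminal order."""
    by_protein = defaultdict(list)
    for protein, family, start, _end in assignments:
        by_protein[protein].append((start, family))
    # Sorting the (start, family) pairs orders the domains along the sequence.
    return {p: tuple(f for _, f in sorted(doms)) for p, doms in by_protein.items()}

def group_by_architecture(mdas):
    """Group protein IDs that share exactly the same ordered architecture."""
    groups = defaultdict(list)
    for protein, mda in mdas.items():
        groups[mda].append(protein)
    return {mda: sorted(proteins) for mda, proteins in groups.items()}

mdas = multi_domain_architectures(assignments)
groups = group_by_architecture(mdas)
for mda, proteins in groups.items():
    print(" - ".join(mda), proteins)
```

Because the architecture is an ordered tuple, protC (SF_B followed by SF_A) falls into a different group from protA and protB, even though all three contain the same two families.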
Recently we have subdivided CATH superfamilies, which are sometimes large and
functionally diverse, into more functionally coherent functional families (FunFams) (PMID:23514456),
improving the functional insights gained from the domain family
assignments.
Domain family assignments have many other uses. For
example, a family may show expansion in a particular species, and it is
possible to detect such expansions and relate them to evolutionary
pressures.
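
One crude way to look for such expansions, sketched here in Python, is to count the domains assigned to each family in each species and flag counts that are several times the median across species. The species names, family labels, counts and the `fold` threshold are hypothetical, and the text above does not specify the method actually used; a real analysis would at least normalise for proteome size.

```python
from collections import Counter
from statistics import median

# Hypothetical number of assigned domains per family in each species.
counts = {
    "species_1": Counter({"SF_A": 4, "SF_B": 10}),
    "species_2": Counter({"SF_A": 5, "SF_B": 11}),
    "species_3": Counter({"SF_A": 40, "SF_B": 9}),
}

def expanded_families(counts, fold=3.0):
    """Return (family, species, count, median) where count >= fold * median."""
    families = set().union(*counts.values())
    flagged = []
    for family in sorted(families):
        # A Counter returns 0 for families absent from a species.
        per_species = {species: c[family] for species, c in counts.items()}
        typical = median(per_species.values())
        for species, n in sorted(per_species.items()):
            if typical > 0 and n >= fold * typical:
                flagged.append((family, species, n, typical))
    return flagged

flagged = expanded_families(counts)
print(flagged)
```

With these invented counts, only SF_A in species_3 is flagged, since 40 domains is well above three times the median of 5.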
In regions not assigned a CATH domain, we try to predict SUPERFAMILY or Pfam domain families. Combining the resources in this way gives greater domain coverage of each sequence.
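
The effect of combining resources can be sketched in Python as follows: CATH assignments are kept first, and a SUPERFAMILY or Pfam prediction is accepted only if it does not overlap anything already accepted. The coordinates, labels and function names are hypothetical. The strict no-overlap rule and the order in which the other resources are tried are assumptions made for this sketch, not a statement of the actual procedure.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]

def combine(cath, others):
    """Keep all CATH domains, then add other hits that fall in unassigned regions."""
    accepted = [("CATH", family, start, end) for family, start, end in cath]
    for source, family, start, end in others:
        # Assumption: any overlap with an accepted domain rejects the hit.
        if not any(overlaps((start, end), (d[2], d[3])) for d in accepted):
            accepted.append((source, family, start, end))
    return sorted(accepted, key=lambda d: d[2])

def coverage(domains, length):
    """Fraction of residues covered by at least one domain."""
    covered = set()
    for _source, _family, start, end in domains:
        covered.update(range(start, end + 1))
    return len(covered) / length

# Hypothetical 400-residue protein with one CATH domain and two other predictions.
seq_length = 400
cath = [("SF_A", 10, 180)]
others = [
    ("SUPERFAMILY", "sf_x", 150, 260),  # overlaps the CATH domain, so rejected
    ("Pfam", "PF_y", 200, 320),         # lies in an unassigned region, so accepted
]

cath_only = combine(cath, [])
combined = combine(cath, others)
print(coverage(cath_only, seq_length), coverage(combined, seq_length))
```

In this invented case the Pfam prediction fills a region with no CATH assignment, so the combined coverage is higher than the coverage from CATH alone.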