CATH--a hierarchic classification of protein domain structures. 1997

C A Orengo, A D Michie, S Jones, D T Jones, M B Swindells, and J M Thornton
Department of Biochemistry and Molecular Biology, University College London, UK.

BACKGROUND: Protein evolution gives rise to families of structurally related proteins, within which sequence identities can be extremely low. As a result, structure-based classifications can be effective at identifying unanticipated relationships in known structures and, in optimal cases, function can also be assigned. The ever-increasing number of known protein structures is too large for all proteins to be classified manually; automatic methods are therefore needed for fast evaluation of protein structures.

RESULTS: We present a semi-automatic procedure for deriving a novel hierarchical classification of protein domain structures (CATH). The four main levels of our classification are protein class (C), architecture (A), topology (T) and homologous superfamily (H). Class is the simplest level, and it essentially describes the secondary structure composition of each domain. In contrast, architecture summarises the shape revealed by the orientations of the secondary structure units, such as barrels and sandwiches. At the topology level, sequential connectivity is considered, such that members of the same architecture might have quite different topologies. When structures belonging to the same T-level have suitably high similarities combined with similar functions, the proteins are assumed to be evolutionarily related and are put into the same homologous superfamily.

CONCLUSIONS: Analysis of the structural families generated by CATH reveals the prominent features of protein structure space. We find that nearly a third of the homologous superfamilies (H-levels) belong to ten major T-levels, which we call superfolds, and furthermore that nearly two-thirds of these H-levels cluster into nine simple architectures. A database of well-characterised protein structure families, such as CATH, will facilitate the assignment of structure-function/evolution relationships to both known and newly determined protein structures.
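To make the four-level hierarchy concrete, the following is a minimal Python sketch of how a domain's position in a C/A/T/H tree could be represented and how domains could be grouped at any level. It is an illustration only, not the authors' procedure: the dotted four-number code format is an assumption not stated in the abstract, and the names `CathCode`, `parse_code`, `group_by_level`, and `DOMAINS`, along with every domain name and code value, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class CathCode(NamedTuple):
    """Position of one domain in a four-level C.A.T.H hierarchy."""
    c: int  # class: secondary structure composition
    a: int  # architecture: overall shape from secondary structure orientations
    t: int  # topology: sequential connectivity
    h: int  # homologous superfamily: inferred evolutionary relationship

    def prefix(self, depth: int) -> tuple:
        """Return the first `depth` levels (1 = C only, 4 = full C.A.T.H)."""
        return tuple(self)[:depth]


def parse_code(text: str) -> CathCode:
    """Parse a dotted code such as '1.10.20.1' (format assumed for illustration)."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {text!r}")
    return CathCode(*(int(p) for p in parts))


def group_by_level(domains: dict, depth: int) -> dict:
    """Group domain names by the first `depth` levels of their codes."""
    groups = defaultdict(list)
    for name, code in domains.items():
        groups[parse_code(code).prefix(depth)].append(name)
    return {key: sorted(names) for key, names in groups.items()}


# Hypothetical domains and codes, invented purely for this example.
DOMAINS = {
    "domA": "1.10.20.1",
    "domB": "1.10.20.2",
    "domC": "1.10.30.1",
    "domD": "2.40.10.1",
}
```

With this toy data, `group_by_level(DOMAINS, 2)` places `domA`, `domB`, and `domC` under the same architecture `(1, 10)`, while `group_by_level(DOMAINS, 3)` separates `domC` into its own topology. This mirrors the statement above that members of one architecture can have quite different topologies.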

UI | MeSH Term | Description | Entries
D008958 | Models, Molecular | Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures. | Molecular Models; Model, Molecular; Molecular Model
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein; Gene Proteins; Protein; Protein Gene Products; Proteins, Gene
D016208 | Databases, Factual | Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references. | Databanks, Factual; Data Banks, Factual; Data Bases, Factual; Data Bank, Factual; Data Base, Factual; Databank, Factual; Database, Factual; Factual Data Bank; Factual Data Banks; Factual Data Base; Factual Data Bases; Factual Databank; Factual Databanks; Factual Database; Factual Databases
D017386 | Sequence Homology, Amino Acid | The degree of similarity between sequences of amino acids. This information is useful for analyzing the genetic relatedness of proteins and species. | Homologous Sequences, Amino Acid; Amino Acid Sequence Homology; Homologs, Amino Acid Sequence; Homologs, Protein Sequence; Homology, Protein Sequence; Protein Sequence Homologs; Protein Sequence Homology; Sequence Homology, Protein; Homolog, Protein Sequence; Homologies, Protein Sequence; Protein Sequence Homolog; Protein Sequence Homologies; Sequence Homolog, Protein; Sequence Homologies, Protein; Sequence Homologs, Protein
D017433 | Protein Structure, Secondary | The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to ALPHA-HELICES; BETA-STRANDS (which align to form BETA-SHEETS), or other types of coils. This is the first folding level of protein conformation. | Secondary Protein Structure; Protein Structures, Secondary; Secondary Protein Structures; Structure, Secondary Protein; Structures, Secondary Protein
D017434 | Protein Structure, Tertiary | The level of protein structure in which combinations of secondary protein structures (ALPHA HELICES; BETA SHEETS; loop regions, and AMINO ACID MOTIFS) pack together to form folded shapes. Disulfide bridges between cysteines in two different parts of the polypeptide chain along with other interactions between the chains play a role in the formation and stabilization of tertiary structure. | Tertiary Protein Structure; Protein Structures, Tertiary; Tertiary Protein Structures
D017510 | Protein Folding | Processes involved in the formation of TERTIARY PROTEIN STRUCTURE. | Protein Folding, Globular; Folding, Globular Protein; Folding, Protein; Foldings, Globular Protein; Foldings, Protein; Globular Protein Folding; Globular Protein Foldings; Protein Foldings; Protein Foldings, Globular
