Sequence-structure matching in globular proteins: application to supersecondary and tertiary structure determination. 1992

A Godzik and J Skolnick
Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037.

A methodology designed to address the inverse globular protein-folding problem (the identification of which sequences are compatible with a given three-dimensional structure) is described. By using a library of protein fingerprints, defined by the side chain interaction pattern, it is possible to match each structure to its own sequence in an exhaustive database search. It is shown that this is a permissive requirement for the validation of the methodology. To pass the more rigorous test of identifying proteins that are not close sequence homologs, but that have similar structure, the method has been extended to include insertions and deletions in the sequence, which is compared to the fingerprint. This allows for the identification of sequences having little or no sequence homology to the fingerprint. Examples include plastocyanin/azurin/pseudoazurin, the globin family, different families of proteases and cytochromes, including cytochromes c' and b-562, actinidin/papain, and lysozyme/alpha-lactalbumin. Turning to supersecondary structure prediction, we find that alpha/beta/alpha fragments possess sufficient specificity to identify their own and related sequences. By threading a beta-hairpin through a sequence, it is possible to predict the location of such hairpins and turns with remarkable fidelity. Thus, the method greatly extends existing techniques for the prediction of both global structural homology and local supersecondary structure.
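
The idea of threading a structural fingerprint through a sequence can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the fingerprint is reduced to a list of contacting position pairs, the contact energy `pair_energy` is a toy hydrophobic/polar rule, and the function `thread_gapless` slides the fingerprint along the sequence without gaps, whereas the method described above also allows insertions and deletions.

```python
from typing import List, Tuple

# Assumption: a crude hydrophobic alphabet, used only for this toy example.
HYDROPHOBIC = set("AVLIMFWC")


def pair_energy(a: str, b: str) -> float:
    """Toy contact energy (hypothetical, not the paper's potential).

    Hydrophobic-hydrophobic contacts are favourable, mixed contacts are
    penalised, and polar-polar contacts are neutral.
    """
    ha, hb = a in HYDROPHOBIC, b in HYDROPHOBIC
    if ha and hb:
        return -1.0
    if ha != hb:
        return 0.5
    return 0.0


def thread_gapless(
    sequence: str, contacts: List[Tuple[int, int]], length: int
) -> Tuple[int, float, List[float]]:
    """Slide a fingerprint of `length` positions along `sequence`.

    `contacts` lists pairs of fingerprint positions (0-based) that touch in
    the template structure. Returns the best offset, its energy, and the
    energy at every offset.
    """
    if len(sequence) < length:
        raise ValueError("sequence is shorter than the fingerprint")
    energies = []
    for offset in range(len(sequence) - length + 1):
        # Sum the contact energies for the residues placed at each contact.
        energy = sum(
            pair_energy(sequence[offset + i], sequence[offset + j])
            for i, j in contacts
        )
        energies.append(energy)
    best = min(range(len(energies)), key=energies.__getitem__)
    return best, energies[best], energies


# Hypothetical hairpin-like fingerprint: positions 0-5 and 1-4 are in contact.
best_offset, best_energy, profile = thread_gapless(
    "GGSGVLGGIVGSGG", [(0, 5), (1, 4)], 6
)
print(best_offset, best_energy)
```

The offset with the lowest energy marks where the sequence fits the contact pattern best; extending the sketch to insertions and deletions would require a dynamic-programming alignment rather than a simple sliding window.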

| UI | MeSH Term | Description | Entries |
| --- | --- | --- | --- |
| D010446 | Peptide Fragments | Partial proteins formed by partial hydrolysis of complete proteins or generated through PROTEIN ENGINEERING techniques. | Peptide Fragment; Fragment, Peptide; Fragments, Peptide |
| D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein; Gene Proteins; Protein; Protein Gene Products; Proteins, Gene |
| D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary; Amino Acid Sequences; Sequence, Amino Acid; Sequences, Amino Acid; Primary Protein Structure; Primary Protein Structures; Protein Structures, Primary; Structure, Primary Protein; Structures, Primary Protein |
| D012997 | Solvents | Liquids that dissolve other substances (solutes), generally solids, without any change in chemical composition, as, water containing sugar. (Grant & Hackh's Chemical Dictionary, 5th ed) | Solvent |
| D013329 | Structure-Activity Relationship | The relationship between the chemical structure of a compound and its biological or pharmacological activity. Compounds are often classed together because they have structural characteristics in common including shape, size, stereochemical arrangement, and distribution of functional groups. | Relationship, Structure-Activity; Relationships, Structure-Activity; Structure Activity Relationship; Structure-Activity Relationships |
| D013816 | Thermodynamics | A rigorously mathematical analysis of energy relationships (heat, work, temperature, and equilibrium). It describes systems whose states are determined by thermal parameters, such as temperature, in addition to mechanical and electromagnetic parameters. (From Hawley's Condensed Chemical Dictionary, 12th ed) | Thermodynamic |
| D016415 | Sequence Alignment | The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. | Sequence Homology Determination; Determination, Sequence Homology; Alignment, Sequence; Alignments, Sequence; Determinations, Sequence Homology; Sequence Alignments; Sequence Homology Determinations |
| D017433 | Protein Structure, Secondary | The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to ALPHA-HELICES; BETA-STRANDS (which align to form BETA-SHEETS), or other types of coils. This is the first folding level of protein conformation. | Secondary Protein Structure; Protein Structures, Secondary; Secondary Protein Structures; Structure, Secondary Protein; Structures, Secondary Protein |
| D017434 | Protein Structure, Tertiary | The level of protein structure in which combinations of secondary protein structures (ALPHA HELICES; BETA SHEETS; loop regions, and AMINO ACID MOTIFS) pack together to form folded shapes. Disulfide bridges between cysteines in two different parts of the polypeptide chain along with other interactions between the chains play a role in the formation and stabilization of tertiary structure. | Tertiary Protein Structure; Protein Structures, Tertiary; Tertiary Protein Structures |
