A supersecondary structure library and search algorithm for modeling loops in protein structures. 2006

Narcis Fernandez-Fuentes, Baldomero Oliva, and András Fiser
Department of Biochemistry and Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA.

We present a fragment-search-based method for predicting loop conformations in protein models. A hierarchical and multidimensional database has been set up that currently classifies 105,950 loop fragments and loop-flanking secondary structures. Besides the length of the loops and the types of bracing secondary structures, the database is organized along four internal coordinates: a distance and three types of angles that characterize the geometry of the stem regions. Candidate fragments are selected from this library by matching the loop length and the types of bracing secondary structures of the query and by satisfying the geometrical restraints of the stems. They are subsequently inserted into the query protein framework, where their fit is assessed by the root mean square deviation (r.m.s.d.) of the stem regions and by the number of rigid body clashes with the environment. In the final step, the remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and on the fit of predicted and observed phi/psi main chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length to identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs predictions. Predicted segments are returned as they are or, optionally, completed with side chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets from which all trivial sequence similarities at the SCOP superfamily level were removed. Under these conditions it is possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% and with an r.m.s.d. accuracy of at least 0.22, 1.38 and 2.47 Å, respectively.
In a head-to-head comparison on loops extracted from freshly deposited new protein folds, the current method outperformed an earlier database search method by a ratio of approximately 5:1.
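
The three stages described in the abstract (library selection, fit assessment, Z-score ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Fragment` fields, the tolerance and cut-off parameters, and the `raw_score` value (standing in for the combined sequence similarity and phi/psi propensity term, whose exact form the abstract does not give) are all assumptions made for illustration. The Z-score here is simply the raw score standardized over the surviving candidates, and `rmsd` assumes the two coordinate sets have already been superposed.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, pstdev


@dataclass
class Fragment:
    """Hypothetical library entry; field names are illustrative assumptions."""
    name: str
    length: int        # number of loop residues
    flanks: str        # bracing secondary structure types, e.g. "HE"
    geometry: tuple    # (distance, angle1, angle2, angle3) describing the stems
    stem_rmsd: float   # stem r.m.s.d. after insertion into the query, in angstroms
    clashes: int       # rigid body clashes with the environment
    raw_score: float   # assumed combined sequence + phi/psi propensity score


def rmsd(a, b):
    """r.m.s.d. between two equal-length lists of already superposed (x, y, z) points."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return sqrt(squared / len(a))


def geometry_matches(query, candidate, tolerances):
    """True if every internal coordinate is within its tolerance.

    Simplification: angles are compared directly, ignoring periodic wrap-around.
    """
    return all(abs(q - c) <= t for q, c, t in zip(query, candidate, tolerances))


def rank_candidates(library, length, flanks, geometry, tolerances,
                    max_stem_rmsd, max_clashes):
    """Return (name, z_score) pairs for surviving fragments, best first."""
    # Stage 1: select by loop length, flank types and stem geometry.
    selected = [f for f in library
                if f.length == length
                and f.flanks == flanks
                and geometry_matches(geometry, f.geometry, tolerances)]
    # Stage 2: keep fragments that fit the query framework.
    fitted = [f for f in selected
              if f.stem_rmsd <= max_stem_rmsd and f.clashes <= max_clashes]
    if not fitted:
        return []
    # Stage 3: standardize the raw scores over the surviving candidates.
    scores = [f.raw_score for f in fitted]
    mu, sigma = mean(scores), pstdev(scores)
    ranked = [(f.name, (f.raw_score - mu) / sigma if sigma > 0 else 0.0)
              for f in fitted]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

In a setting like the one the abstract describes, a per-length confidence cut-off would then be applied to the top Z-score to decide whether to accept the fragment or fall back to another method; the cut-off values themselves are not given here and are not assumed by the sketch.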

UI | MeSH Term | Description | Entries
D008958 | Models, Molecular | Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures. | Molecular Models; Model, Molecular; Molecular Model
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D015394 | Molecular Structure | The location of the atoms, groups or ions relative to one another in a molecule, as well as the number, type and location of covalent bonds. | Structure, Molecular; Molecular Structures; Structures, Molecular
D017433 | Protein Structure, Secondary | The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to ALPHA-HELICES; BETA-STRANDS (which align to form BETA-SHEETS), or other types of coils. This is the first folding level of protein conformation. | Secondary Protein Structure; Protein Structures, Secondary; Secondary Protein Structures; Structure, Secondary Protein; Structures, Secondary Protein
D017510 | Protein Folding | Processes involved in the formation of TERTIARY PROTEIN STRUCTURE. | Protein Folding, Globular; Folding, Globular Protein; Folding, Protein; Foldings, Globular Protein; Foldings, Protein; Globular Protein Folding; Globular Protein Foldings; Protein Foldings; Protein Foldings, Globular
D020407 | Internet | A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks. The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange. | World Wide Web; Cyber Space; Cyberspace; Web, World Wide; Wide Web, World
D020539 | Sequence Analysis, Protein | A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence. | Amino Acid Sequence Analysis; Peptide Sequence Analysis; Protein Sequence Analysis; Sequence Determination, Protein; Amino Acid Sequence Analyses; Amino Acid Sequence Determination; Amino Acid Sequence Determinations; Amino Acid Sequencing; Peptide Sequence Determination; Protein Sequencing; Sequence Analyses, Amino Acid; Sequence Analysis, Amino Acid; Sequence Analysis, Peptide; Sequence Determination, Amino Acid; Sequence Determinations, Amino Acid; Acid Sequencing, Amino; Analyses, Peptide Sequence; Analyses, Protein Sequence; Analysis, Peptide Sequence; Analysis, Protein Sequence; Peptide Sequence Analyses; Peptide Sequence Determinations; Protein Sequence Analyses; Protein Sequence Determination; Protein Sequence Determinations; Sequence Analyses, Peptide; Sequence Analyses, Protein; Sequence Determination, Peptide; Sequence Determinations, Peptide; Sequence Determinations, Protein; Sequencing, Amino Acid; Sequencing, Protein
D030562 | Databases, Protein | Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties. | Amino Acid Sequence Databases; Databases, Amino Acid Sequence; Protein Databases; Protein Sequence Databases; SWISS-PROT; Protein Structure Databases; SwissProt; Database, Protein; Database, Protein Sequence; Database, Protein Structure; Databases, Protein Sequence; Databases, Protein Structure; Protein Database; Protein Sequence Database; Protein Structure Database; SWISS PROT; Sequence Database, Protein; Sequence Databases, Protein; Structure Database, Protein; Structure Databases, Protein
