Improving protein secondary structure prediction with aligned homologous sequences. 1996

V Di Francesco, and J Garnier, and P J Munson
NIH/DCRT/LSB, Bethesda, Maryland 20892-5626, USA. valedf@helix.nih.gov

Most recent protein secondary structure prediction methods use sequence alignments to improve prediction quality. We investigate the relationship between the location of secondary structural elements, gaps, and variable residue positions in multiple sequence alignments, and how these relationships compare with those found in structurally aligned protein families. We show how such associations may be used to improve the prediction of secondary structure elements, using the Quadratic-Logistic method with profiles. Furthermore, we analyze the extent to which the number of homologous sequences influences the quality of prediction. The analysis of variable residue positions shows that, surprisingly, helical regions exhibit greater variability than do coil regions, which are generally thought to be the most common secondary structure elements in loops. However, the correlation between variability and the presence of helices does not significantly improve prediction quality. Gaps are a distinct signal for coil regions: increasing the coil propensity for residues occurring in gap regions enhances the overall prediction quality. Prediction accuracy increases initially with the number of homologues, but changes negligibly once the number of homologues exceeds about 14. Alignment quality affects the prediction more than other factors; hence, a careful selection and alignment of even a small number of homologues can lead to significant improvements in prediction accuracy.
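
The two alignment-derived signals discussed in the abstract, gaps and residue variability, can be illustrated with a minimal Python sketch. This is not the Quadratic-Logistic method and it does not reproduce the authors' procedure. The function names (column_stats, boost_coil), the use of Shannon entropy as the variability measure, the multiplicative form of the coil adjustment, and the weight value are all assumptions made for illustration only.

```python
import math
from collections import Counter

GAP = "-"  # assumed gap character in the aligned sequences


def column_stats(alignment):
    """Return a (gap_fraction, entropy) pair for every alignment column.

    Shannon entropy (in bits) over the non-gap residues is used here as a
    simple stand-in for "variability"; the paper may define it differently.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    stats = []
    for column in zip(*alignment):
        residues = [c for c in column if c != GAP]
        gap_fraction = 1.0 - len(residues) / len(column)
        total = len(residues)
        entropy = 0.0
        if total:
            counts = Counter(residues)
            entropy = sum(-(n / total) * math.log2(n / total)
                          for n in counts.values())
        stats.append((gap_fraction, entropy))
    return stats


def boost_coil(probs, gap_fractions, weight=0.5):
    """Raise the coil share in gapped columns, then renormalise.

    probs is a list of dicts with keys "H" (helix), "E" (strand), "C" (coil).
    The multiplicative boost and the default weight are hypothetical choices.
    """
    adjusted = []
    for p, gap in zip(probs, gap_fractions):
        scaled = dict(p)
        scaled["C"] = p["C"] * (1.0 + weight * gap)
        norm = sum(scaled.values())
        adjusted.append({state: value / norm for state, value in scaled.items()})
    return adjusted


# Toy alignment of four homologues; column 2 is mostly gaps.
alignment = ["ACDE", "A-DE", "G-DE", "A-KE"]
stats = column_stats(alignment)
baseline = [{"H": 0.4, "E": 0.2, "C": 0.4} for _ in stats]
adjusted = boost_coil(baseline, [gap for gap, _ in stats])
```

In this toy case only the second column contains gaps, so only its coil probability rises after renormalisation; the ungapped columns keep their baseline values. A real predictor would learn how strongly to weight the gap signal from data rather than fix it by hand.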

UI | MeSH Term | Description | Entries
D008969 | Molecular Sequence Data | Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. | Sequence Data, Molecular; Molecular Sequencing Data; Data, Molecular Sequence; Data, Molecular Sequencing; Sequencing Data, Molecular
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary; Amino Acid Sequences; Sequence, Amino Acid; Sequences, Amino Acid; Primary Protein Structure; Primary Protein Structures; Protein Structures, Primary; Structure, Primary Protein; Structures, Primary Protein
D000818 | Animals | Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. | Animal; Metazoa; Animalia
D016254 | Mutagenesis, Insertional | Mutagenesis where the mutation is caused by the introduction of foreign DNA sequences into a gene or extragenic sequence. This may occur spontaneously in vivo or be experimentally induced in vivo or in vitro. Proviral DNA insertions into or adjacent to a cellular proto-oncogene can interrupt GENETIC TRANSLATION of the coding sequences or interfere with recognition of regulatory elements and cause unregulated expression of the proto-oncogene resulting in tumor formation. | Gene Insertion; Insertion Mutation; Insertional Activation; Insertional Mutagenesis; Linker-Insertion Mutagenesis; Mutagenesis, Cassette; Sequence Insertion; Viral Insertional Mutagenesis; Activation, Insertional; Activations, Insertional; Cassette Mutagenesis; Gene Insertions; Insertion Mutations; Insertion, Gene; Insertion, Sequence; Insertional Activations; Insertional Mutagenesis, Viral; Insertions, Gene; Insertions, Sequence; Linker Insertion Mutagenesis; Mutagenesis, Linker-Insertion; Mutagenesis, Viral Insertional; Mutation, Insertion; Mutations, Insertion; Sequence Insertions
D017384 | Sequence Deletion | Deletion of sequences of nucleic acids from the genetic material of an individual. | Deletion Mutation; Deletion Mutations; Deletion, Sequence; Deletions, Sequence; Mutation, Deletion; Mutations, Deletion; Sequence Deletions
D017386 | Sequence Homology, Amino Acid | The degree of similarity between sequences of amino acids. This information is useful for analyzing the genetic relatedness of proteins and species. | Homologous Sequences, Amino Acid; Amino Acid Sequence Homology; Homologs, Amino Acid Sequence; Homologs, Protein Sequence; Homology, Protein Sequence; Protein Sequence Homologs; Protein Sequence Homology; Sequence Homology, Protein; Homolog, Protein Sequence; Homologies, Protein Sequence; Protein Sequence Homolog; Protein Sequence Homologies; Sequence Homolog, Protein; Sequence Homologies, Protein; Sequence Homologs, Protein
D017433 | Protein Structure, Secondary | The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to ALPHA-HELICES; BETA-STRANDS (which align to form BETA-SHEETS), or other types of coils. This is the first folding level of protein conformation. | Secondary Protein Structure; Protein Structures, Secondary; Secondary Protein Structures; Structure, Secondary Protein; Structures, Secondary Protein
