Sequence database searches via de novo peptide sequencing by tandem mass spectrometry. 1997

J A Taylor and R S Johnson
Department of Biochemistry, University of Washington, Seattle 98195-7350, USA.

A method is described for searching protein sequence databases using tandem mass spectra of tryptic peptides. The approach uses a de novo sequencing algorithm to derive a short list of candidate sequences, which then serve as query sequences in a subsequent homology-based database search routine. The sequencing algorithm employs a graph theory approach similar to that of previously described sequencing programs. In addition, amino acid composition, peptide sequence tags, and incomplete or ambiguous Edman sequence data can be used to aid in the sequence determinations. Although sequencing of peptides from tandem mass spectra is possible, one frequently encountered difficulty is that several alternative sequences can be deduced from one spectrum. Most of the alternative sequences, however, are sufficiently similar for a homology-based sequence database search to be possible. Unfortunately, the available protein sequence database search algorithms (e.g. BLAST or FASTA) require a single unambiguous sequence as input. Here we describe how the publicly available FASTA computer program was modified in order to search protein databases more effectively in spite of the ambiguities intrinsic to de novo peptide sequencing algorithms.
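The two ideas in the abstract, a graph-based reading of a fragment-ion ladder and a database search that tolerates ambiguous residues, can be illustrated with a minimal sketch. It is not the authors' sequencing program or their modified FASTA. The function names (`prefix_masses`, `candidate_sequences`, `allowed_residues`, `best_ungapped_hit`), the 0.05 Da tolerance, the idealised noise-free ladder of prefix masses, and the ungapped identity score are all assumptions chosen for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def prefix_masses(peptide):
    """Cumulative residue masses: an idealised, noise-free fragment ladder."""
    total, masses = 0.0, [0.0]
    for residue in peptide:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def candidate_sequences(nodes, tol=0.05):
    """Enumerate every path from the lightest to the heaviest node.

    Two nodes are joined by an edge labelled with a residue whenever their
    mass difference matches that residue's mass within `tol` Da (assumed value).
    """
    nodes = sorted(nodes)
    goal = len(nodes) - 1
    results = []

    def walk(i, prefix):
        if i == goal:
            results.append(prefix)
            return
        for j in range(i + 1, len(nodes)):
            gap = nodes[j] - nodes[i]
            for residue, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    walk(j, prefix + residue)

    walk(0, "")
    return sorted(results)


def allowed_residues(candidates):
    """Collapse candidates into one set of permitted residues per position.

    Assumes all candidates have the same length (zip stops at the shortest).
    """
    return [set(column) for column in zip(*candidates)]


def best_ungapped_hit(candidates, protein):
    """Slide the ambiguous query along `protein`; return (score, offset).

    The score counts positions where the protein residue is among the
    residues permitted by any candidate. Gaps are not considered.
    """
    query = allowed_residues(candidates)
    best = (0, -1)
    for offset in range(len(protein) - len(query) + 1):
        score = sum(protein[offset + k] in allowed
                    for k, allowed in enumerate(query))
        if score > best[0]:
            best = (score, offset)
    return best


candidates = candidate_sequences(prefix_masses("SLK"))
print(candidates)
print(best_ungapped_hit(candidates, "MASIQR"))
```

With the assumed tolerance, the single ladder for SLK yields four candidates, because leucine and isoleucine have identical masses and lysine and glutamine differ by only about 0.036 Da. Treating each query position as a set of permitted residues lets one scan of the database stand in for all four candidates at once.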

UI | MeSH Term | Description | Entries
D008969 | Molecular Sequence Data | Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. | Sequence Data, Molecular; Molecular Sequencing Data; Data, Molecular Sequence; Data, Molecular Sequencing; Sequencing Data, Molecular
D010455 | Peptides | Members of the class of compounds composed of AMINO ACIDS joined together by peptide bonds between adjacent amino acids into linear, branched or cyclical structures. OLIGOPEPTIDES are composed of approximately 2-12 amino acids. Polypeptides are composed of approximately 13 or more amino acids. PROTEINS are considered to be larger versions of peptides that can form into complex structures such as ENZYMES and RECEPTORS. | Peptide; Polypeptide; Polypeptides
D003628 | Database Management Systems | Software designed to store, manipulate, manage, and control data for specific uses. | Data Base Management Systems; Management System, Data Base; Management Systems, Data Base; System, Data Base Management; Systems, Data Base Management; Database Management System
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D006868 | Hydrolysis | The process of cleaving a chemical compound by the addition of a molecule of water. |
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary; Amino Acid Sequences; Sequence, Amino Acid; Sequences, Amino Acid; Primary Protein Structure; Primary Protein Structures; Protein Structures, Primary; Structure, Primary Protein; Structures, Primary Protein
D000818 | Animals | Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. | Animal; Metazoa; Animalia
D013058 | Mass Spectrometry | An analytical method used in determining the identity of a chemical based on its mass using mass analyzers/mass spectrometers. | Mass Spectroscopy; Spectrometry, Mass; Spectroscopy, Mass; Spectrum Analysis, Mass; Analysis, Mass Spectrum; Mass Spectrum Analysis; Analyses, Mass Spectrum; Mass Spectrum Analyses; Spectrum Analyses, Mass
D014357 | Trypsin | A serine endopeptidase that is formed from TRYPSINOGEN in the pancreas. It is converted into its active form by ENTEROPEPTIDASE in the small intestine. It catalyzes hydrolysis of the carboxyl group of either arginine or lysine. EC 3.4.21.4. | Tripcellim; Trypure; beta-Trypsin; beta Trypsin
