Peptide mass fingerprinting peak intensity prediction: extracting knowledge from spectra. 2002

Steven Gay, Pierre-Alain Binz, Denis F Hochstrasser, and Ron D Appel
Swiss Institute of Bioinformatics, Geneva, Switzerland.

Matrix-assisted laser desorption/ionization-time of flight mass spectrometry has become a valuable tool in proteomics. With the increasing acquisition rate of mass spectrometers, one of the major issues is the development of accurate, efficient and automatic peptide mass fingerprinting (PMF) identification tools. Current tools are mostly based on counting the number of experimental peptide masses that match theoretical masses. Almost all of them use additional criteria such as isoelectric point, molecular weight, post-translational modifications (PTMs), taxonomy or enzymatic cleavage rules to enhance prediction performance. However, these identification tools seldom use peak intensities as a parameter, because there is currently no model that predicts intensities from the physicochemical properties of peptides. In this work, we used standard data mining methods, namely classification and regression, to find correlations between peak intensities and the properties of the peptides composing a PMF spectrum. These methods were applied to a dataset comprising a series of PMF experiments involving 157 proteins. We found that the C4.5 method gave the most informative results for the classification task (prediction of the presence or absence of a peptide in a spectrum) and M5' for the regression task (prediction of the normalized intensity of a peptide peak). The C4.5 model correctly classified 88% of the theoretical peaks, whereas the peak intensities predicted by M5' had a correlation coefficient of 0.6743 with the experimental peak intensities. These methods enabled us to obtain decision and model trees that can be directly used for prediction and identification of PMF results. This work lays the foundations of a method to analyze the factors influencing the peak intensity of PMF spectra. A simple extension of this analysis, using a larger dataset, could improve the accuracy of the results. Additional peptide characteristics or even PMF experimental parameters can also be taken into account in the data mining process to analyze their influence on the peak intensity. Furthermore, this data mining approach can certainly be extended to the tandem mass spectrometry domain or other mass spectrometry derived methods.
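
The match-counting baseline mentioned in the abstract (counting experimental peptide masses that match theoretical masses) can be illustrated with a minimal sketch. This is not the authors' tool or any published scoring scheme. The function names, the 0.2 Da tolerance, the no-missed-cleavage digestion and the example sequence are all assumptions made for illustration. The residue masses are standard monoisotopic values.

```python
# Illustrative sketch only: names, tolerance and digestion settings are
# assumptions, not taken from the paper.

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum of a free peptide
PROTON = 1.00728   # mass of the proton in a singly charged [M+H]+ ion


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def mh_mass(peptide):
    """Theoretical monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(experimental_masses, peptides, tol_da=0.2):
    """Count experimental peaks within tol_da of any theoretical [M+H]+ mass."""
    theoretical = [mh_mass(p) for p in peptides]
    return sum(
        any(abs(peak - mass) <= tol_da for mass in theoretical)
        for peak in experimental_masses
    )


if __name__ == "__main__":
    peptides = tryptic_digest("AGKRPSLRMK")  # hypothetical sequence
    peaks = [mh_mass(peptides[0]) + 0.05, 5000.0]  # one hit, one miss
    print(peptides, count_matches(peaks, peptides))
```

A score of this kind ignores peak intensity entirely, which is the gap the abstract describes: an intensity-aware approach would add per-peptide features (for example length or residue composition) and learn from them whether, and how strongly, each theoretical peptide appears in the spectrum.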

UI | MeSH Term | Description | Entries
D010455 | Peptides | Members of the class of compounds composed of AMINO ACIDS joined together by peptide bonds between adjacent amino acids into linear, branched or cyclical structures. OLIGOPEPTIDES are composed of approximately 2-12 amino acids. Polypeptides are composed of approximately 13 or more amino acids. PROTEINS are considered to be larger versions of peptides that can form into complex structures such as ENZYMES and RECEPTORS. | Peptide,Polypeptide,Polypeptides
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D006863 | Hydrogen-Ion Concentration | The normality of a solution with respect to HYDROGEN ions; H+. It is related to acidity measurements in most cases by pH | pH,Concentration, Hydrogen-Ion,Concentrations, Hydrogen-Ion,Hydrogen Ion Concentration,Hydrogen-Ion Concentrations
D000199 | Actins | Filamentous proteins that are the main constituent of the thin filaments of muscle fibers. The filaments (known also as filamentous or F-actin) can be dissociated into their globular subunits; each subunit is composed of a single polypeptide 375 amino acids long. This is known as globular or G-actin. In conjunction with MYOSINS, actin is responsible for the contraction and relaxation of muscle. | F-Actin,G-Actin,Actin,Isoactin,N-Actin,alpha-Actin,alpha-Isoactin,beta-Actin,gamma-Actin,F Actin,G Actin,N Actin,alpha Actin,alpha Isoactin,beta Actin,gamma Actin
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D013223 | Statistics as Topic | Works about the science and art of collecting, summarizing, and analyzing data that are subject to random variation. | Area Analysis,Estimation Technics,Estimation Techniques,Indirect Estimation Technics,Indirect Estimation Techniques,Multiple Classification Analysis,Service Statistics,Statistical Study,Statistics, Service,Tables and Charts as Topic,Analyses, Area,Analyses, Multiple Classification,Area Analyses,Classification Analyses, Multiple,Classification Analysis, Multiple,Estimation Technic, Indirect,Estimation Technics, Indirect,Estimation Technique,Estimation Technique, Indirect,Estimation Techniques, Indirect,Indirect Estimation Technic,Indirect Estimation Technique,Multiple Classification Analyses,Statistical Studies,Studies, Statistical,Study, Statistical,Technic, Indirect Estimation,Technics, Estimation,Technics, Indirect Estimation,Technique, Estimation,Technique, Indirect Estimation,Techniques, Estimation,Techniques, Indirect Estimation
D014357 | Trypsin | A serine endopeptidase that is formed from TRYPSINOGEN in the pancreas. It is converted into its active form by ENTEROPEPTIDASE in the small intestine. It catalyzes hydrolysis of the carboxyl group of either arginine or lysine. EC 3.4.21.4. | Tripcellim,Trypure,beta-Trypsin,beta Trypsin
D015180 | Electrophoresis, Gel, Two-Dimensional | Electrophoresis in which a second perpendicular electrophoretic transport is performed on the separate components resulting from the first electrophoresis. This technique is usually performed on polyacrylamide gels. | Gel Electrophoresis, Two-Dimensional,Polyacrylamide Gel Electrophoresis, Two-Dimensional,2-D Gel Electrophoresis,2-D Polyacrylamide Gel Electrophoresis,2D Gel Electrophoresis,2D PAGE,2D Polyacrylamide Gel Electrophoresis,Electrophoresis, Gel, 2-D,Electrophoresis, Gel, 2D,Electrophoresis, Gel, Two Dimensional,Polyacrylamide Gel Electrophoresis, 2-D,Polyacrylamide Gel Electrophoresis, 2D,Two Dimensional Gel Electrophoresis,2 D Gel Electrophoresis,2 D Polyacrylamide Gel Electrophoresis,Electrophoresis, 2-D Gel,Electrophoresis, 2D Gel,Electrophoresis, Two-Dimensional Gel,Gel Electrophoresis, 2-D,Gel Electrophoresis, 2D,Gel Electrophoresis, Two Dimensional,PAGE, 2D,Polyacrylamide Gel Electrophoresis, 2 D,Polyacrylamide Gel Electrophoresis, Two Dimensional,Two-Dimensional Gel Electrophoresis
D015233 | Models, Statistical | Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc. | Probabilistic Models,Statistical Models,Two-Parameter Models,Model, Statistical,Models, Binomial,Models, Polynomial,Statistical Model,Binomial Model,Binomial Models,Model, Binomial,Model, Polynomial,Model, Probabilistic,Model, Two-Parameter,Models, Probabilistic,Models, Two-Parameter,Polynomial Model,Polynomial Models,Probabilistic Model,Two Parameter Models,Two-Parameter Model
D019032 | Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization | A mass spectrometric technique that is used for the analysis of large biomolecules. Analyte molecules are embedded in an excess matrix of small organic molecules that show a high resonant absorption at the laser wavelength used. The matrix absorbs the laser energy, thus inducing a soft disintegration of the sample-matrix mixture into free (gas phase) matrix and analyte molecules and molecular ions. In general, only molecular ions of the analyte molecules are produced, and almost no fragmentation occurs. This makes the method well suited for molecular weight determinations and mixture analysis. | Laser Desorption-Ionization Mass Spectrometry, Matrix-Assisted,MALD-MS,MALDI,Mass Spectrometry, Matrix-Assisted Laser Desorption-Ionization,Mass Spectroscopy, Matrix-Assisted Laser Desorption-Ionization,Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry,Spectroscopy, Mass, Matrix-Assisted Laser Desorption-Ionization,MALDI-MS,MS-MALD,SELDI-TOF-MS,Surface Enhanced Laser Desorption Ionization Mass Spectrometry,Laser Desorption Ionization Mass Spectrometry, Matrix Assisted,MALDI MS,Mass Spectrometry, Matrix Assisted Laser Desorption Ionization,Mass Spectroscopy, Matrix Assisted Laser Desorption Ionization,Matrix Assisted Laser Desorption Ionization Mass Spectrometry
