An intelligent system for identifying acetylated lysine on histones and nonhistone proteins (2014)

Cheng-Tsung Lu, Tzong-Yi Lee, Yu-Ju Chen, and Yi-Ju Chen
Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan.

Lysine acetylation is an important and ubiquitous posttranslational modification conserved in prokaryotes and eukaryotes. This process, which is dynamically and temporally regulated by histone acetyltransferases and deacetylases, is crucial for numerous essential biological processes such as transcriptional regulation, cellular signaling, and stress response. Because the experimental identification of lysine acetylation sites within proteins is time-consuming and labor-intensive, several computational approaches have been developed to identify candidates for experimental validation. In this work, acetylated protein data collected from UniProtKB were categorized into histone and nonhistone proteins. Support vector machines (SVMs) were applied to build the predictive models: the histone model uses amino acid pair composition (AAPC) as its feature, and the nonhistone model combines BLOSUM62 and AAPC features. Furthermore, maximal dependence decomposition (MDD) clustering improved model performance: in fivefold cross-validation, the model yielded a sensitivity of 0.863, a specificity of 0.885, an accuracy of 0.880, and a Matthews correlation coefficient (MCC) of 0.706. The proposed method was also evaluated on independent test sets, giving a predictive accuracy of 74%. This indicates that the performance of our method is comparable with that of other acetylation prediction methods.
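As a rough illustration of two ideas named in the abstract, the sketch below encodes a sequence window as an AAPC vector and computes the four reported evaluation measures from a confusion matrix. It is not the authors' implementation. The abstract does not define AAPC in detail, so the sketch assumes it means the relative frequency of each of the 400 ordered pairs of adjacent standard residues in a window around the candidate lysine. The function names `aapc` and `binary_metrics`, the example window, and the confusion-matrix counts are all hypothetical.

```python
from itertools import product
from math import sqrt

# Assumption: only the 20 standard amino acids are counted.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# 20 x 20 = 400 ordered residue pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aapc(window):
    """Return a 400-element vector of adjacent-pair frequencies.

    Pairs containing a non-standard character (for example a gap
    symbol used to pad a window near a terminus) are skipped.
    """
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for a, b in zip(window, window[1:]):
        pair = a + b
        if pair in counts:
            counts[pair] += 1
            total += 1
    # Normalize counts to frequencies; an empty window gives all zeros.
    return [counts[p] / total if total else 0.0 for p in PAIRS]


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and MCC from raw counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical 13-residue window centered on a lysine (K).
    features = aapc("AGRSTGKAPRKQL")
    print(len(features), round(sum(features), 6))
    # Hypothetical counts, not taken from the paper.
    print(binary_metrics(tp=8, fp=2, tn=18, fn=2))
```

Vectors like these could be passed to any SVM implementation. The BLOSUM62 encoding and the MDD clustering step are omitted here because the abstract does not describe them in enough detail to sketch faithfully.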

UI | MeSH Term | Description | Entries
D008239 | Lysine | An essential amino acid. It is often added to animal feed. | Enisyl,L-Lysine,Lysine Acetate,Lysine Hydrochloride,Acetate, Lysine,L Lysine
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D006657 | Histones | Small chromosomal proteins (approx 12-20 kD) possessing an open, unfolded structure and attached to the DNA in cell nuclei by ionic linkages. Classification into the various types (designated histone I, histone II, etc.) is based on the relative amounts of arginine and lysine in each. | Histone,Histone H1,Histone H1(s),Histone H2a,Histone H2b,Histone H3,Histone H3.3,Histone H4,Histone H5,Histone H7
D000107 | Acetylation | Formation of an acetyl derivative. (Stedman, 25th ed) | Acetylations
D000596 | Amino Acids | Organic compounds that generally contain an amino (-NH2) and a carboxyl (-COOH) group. Twenty alpha-amino acids are the subunits which are polymerized to form proteins. | Amino Acid,Acid, Amino,Acids, Amino
D060388 | Support Vector Machine | SUPERVISED MACHINE LEARNING algorithm which learns to assign labels to objects from a set of training examples. Examples are learning to recognize fraudulent credit card activity by examining hundreds or thousands of fraudulent and non-fraudulent credit card activity, or learning to make disease diagnosis or prognosis based on automatic classification of microarray gene expression profiles drawn from hundreds or thousands of samples. | Support Vector Network,Machine, Support Vector,Machines, Support Vector,Network, Support Vector,Networks, Support Vector,Support Vector Machines,Support Vector Networks,Vector Machine, Support,Vector Machines, Support,Vector Network, Support,Vector Networks, Support
D019295 | Computational Biology | A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. | Bioinformatics,Molecular Biology, Computational,Bio-Informatics,Biology, Computational,Computational Molecular Biology,Bio Informatics,Bio-Informatic,Bioinformatic,Biologies, Computational Molecular,Biology, Computational Molecular,Computational Molecular Biologies,Molecular Biologies, Computational
D030562 | Databases, Protein | Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties. | Amino Acid Sequence Databases,Databases, Amino Acid Sequence,Protein Databases,Protein Sequence Databases,SWISS-PROT,Protein Structure Databases,SwissProt,Database, Protein,Database, Protein Sequence,Database, Protein Structure,Databases, Protein Sequence,Databases, Protein Structure,Protein Database,Protein Sequence Database,Protein Structure Database,SWISS PROT,Sequence Database, Protein,Sequence Databases, Protein,Structure Database, Protein,Structure Databases, Protein
