A domain-based approach to predict protein-protein interactions. 2007

Mudita Singhal, and Haluk Resat
Computational Biology and Bioinformatics Group, Pacific Northwest National Laboratory, Richland, WA 99352, USA. mudita.singhal@pnl.gov

BACKGROUND Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for understanding biological processes at the whole-cell level. The determination of protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses information about domain-domain interactions to predict the interactions between proteins.

RESULTS DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. The resulting domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show that the DomainGA method is robust and insensitive to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis, we conclude that our DomainGA method shows great promise to be applicable across multiple organisms.

CONCLUSIONS We envision DomainGA as the first step of a multi-tier approach to constructing organism-specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs, and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed using the DomainGA scores are reasonably low, and the erroneous predictions can be filtered further using supplementary approaches such as those based on literature search or other prediction methods.
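
To make the scoring idea concrete, the following minimal Python sketch shows how a table of domain-domain scores could be turned into a protein-level prediction and then checked against labelled pairs. It is an illustration under stated assumptions, not the DomainGA implementation: the abstract does not specify the detection rule, so the "best-scoring domain pair meets a threshold" rule, the threshold of 0.5, the reading of "explanation ratio" as the fraction of labelled pairs reproduced, and all protein and domain names (P1, D1, and so on) are hypothetical. The optimization step that derives the scores is not shown.

```python
from itertools import product

def predict_interaction(domains_a, domains_b, scores, threshold=0.5):
    """Assumed detection rule: two proteins are predicted to interact if
    their best-scoring domain pair reaches the threshold."""
    best = max(
        (scores.get(frozenset((da, db)), 0.0)
         for da, db in product(domains_a, domains_b)),
        default=0.0,  # a protein with no annotated domains yields no pairs
    )
    return best >= threshold

def explanation_ratio(pairs, labels, protein_domains, scores, threshold=0.5):
    """Assumed definition: fraction of labelled protein pairs whose
    prediction agrees with the known label (True = interacting)."""
    correct = sum(
        predict_interaction(protein_domains[a], protein_domains[b],
                            scores, threshold) == label
        for (a, b), label in zip(pairs, labels)
    )
    return correct / len(pairs)

# Hypothetical toy data: domain composition of three proteins.
protein_domains = {"P1": ["D1", "D2"], "P2": ["D3"], "P3": ["D4"]}

# Hypothetical domain-pair scores; unordered pairs are stored as frozensets,
# and any pair missing from the table is treated as scoring 0.0.
scores = {
    frozenset(("D1", "D3")): 0.9,
    frozenset(("D2", "D4")): 0.1,
}

pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
labels = [True, False, True]  # made-up positive and negative PPIs

ratio = explanation_ratio(pairs, labels, protein_domains, scores)
print(f"explanation ratio: {ratio:.2f}")
```

In this toy setting, the P2-P3 pair is labelled as interacting but no domain pair in the table supports it, so it counts against the ratio. An optimization procedure of the kind the abstract describes would adjust the score table to reduce such mismatches.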

UI | MeSH Term | Description | Entries
D008956 | Models, Chemical | Theoretical representations that simulate the behavior or activity of chemical processes or phenomena; includes the use of mathematical equations, computers, and other electronic equipment. | Chemical Models,Chemical Model,Model, Chemical
D008958 | Models, Molecular | Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures. | Molecular Models,Model, Molecular,Molecular Model
D008969 | Molecular Sequence Data | Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. | Sequence Data, Molecular,Molecular Sequencing Data,Data, Molecular Sequence,Data, Molecular Sequencing,Sequencing Data, Molecular
D011485 | Protein Binding | The process in which substances, either endogenous or exogenous, bind to proteins, peptides, enzymes, protein precursors, or allied compounds. Specific protein-binding measures are often used as assays in diagnostic assessments. | Plasma Protein Binding Capacity,Binding, Protein
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D003198 | Computer Simulation | Computer-based representation of physical systems and phenomena such as chemical processes. | Computational Modeling,Computational Modelling,Computer Models,In silico Modeling,In silico Models,In silico Simulation,Models, Computer,Computerized Models,Computer Model,Computer Simulations,Computerized Model,In silico Model,Model, Computer,Model, Computerized,Model, In silico,Modeling, Computational,Modeling, In silico,Modelling, Computational,Simulation, Computer,Simulation, In silico,Simulations, Computer
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
D001665 | Binding Sites | The parts of a macromolecule that directly participate in its specific combination with another molecule. | Combining Site,Binding Site,Combining Sites,Site, Binding,Site, Combining,Sites, Binding,Sites, Combining
D017434 | Protein Structure, Tertiary | The level of protein structure in which combinations of secondary protein structures (ALPHA HELICES; BETA SHEETS; loop regions, and AMINO ACID MOTIFS) pack together to form folded shapes. Disulfide bridges between cysteines in two different parts of the polypeptide chain along with other interactions between the chains play a role in the formation and stabilization of tertiary structure. | Tertiary Protein Structure,Protein Structures, Tertiary,Tertiary Protein Structures
