Bayesian classification of residues associated with protein functional divergence: Arf and Arf-like GTPases. 2010

Andrew F Neuwald
Department of Biochemistry & Molecular Biology, Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II, Room 617, 801 West Baltimore St, Baltimore, MD 21201, USA. aneuwald@som.umaryland.edu

BACKGROUND Certain residues within proteins are highly conserved across very distantly related organisms, yet their (presumably critical) structural or mechanistic roles are completely unknown. To obtain clues regarding such residues within Arf and Arf-like (Arf/Arl) GTPases--which function as on/off switches regulating vesicle trafficking, phospholipid metabolism and cytoskeletal remodeling--I apply a new sampling procedure for comparative sequence analysis, termed multiple category Bayesian Partitioning with Pattern Selection (mcBPPS). RESULTS The mcBPPS sampler classified sequences within the entire P-loop GTPase class into multiple categories by identifying those evolutionarily-divergent residues most likely to be responsible for functional specialization. Here I focus on categories of residues that most distinguish various Arf/Arl GTPases from other GTPases. This identified residues whose specific roles have been previously proposed (and in some cases corroborated experimentally and that thus serve as positive controls), as well as several categories of co-conserved residues whose possible roles are first hinted at here. For example, Arf/Arl/Sar GTPases are most distinguished from other GTPases by a conserved aspartate residue within the phosphate binding loop (P-loop) and by co-conserved residues nearby that, together, can form a network of salt-bridge and hydrogen bond interactions centered on the GTPase active site. Residues corresponding to an N-[VI] motif that is conserved within Arf/Arl GTPases may play a role in the interswitch toggle characteristic of the Arf family, whereas other, co-conserved residues may modulate the flexibility of the guanine binding loop. Arl8 GTPases conserve residues that strikingly diverge from those typically found in other Arf/Arl GTPases and that form structural interactions suggestive of a novel interswitch toggle mechanism. CONCLUSIONS This analysis suggests specific mutagenesis experiments to explore mechanisms underlying GTP hydrolysis, nucleotide exchange and interswitch toggling within Arf/Arl GTPases. More generally, it illustrates how the mcBPPS sampler can complement traditional evolutionary analyses by providing an objective, quantitative and statistically rigorous way to explore protein functional-divergence in molecular detail. Because the sampler classifies the input sequences at the same time, it can be used to generate subgroup profiles, in which functionally-divergent categories of residues are annotated automatically.

UI MeSH Term Description Entries
D008565 Membrane Proteins Proteins which are found in membranes including cellular and intracellular membranes. They consist of two types, peripheral and integral proteins. They include most membrane-associated enzymes, antigenic proteins, transport proteins, and drug, hormone, and lectin receptors. Cell Membrane Protein,Cell Membrane Proteins,Cell Surface Protein,Cell Surface Proteins,Integral Membrane Proteins,Membrane-Associated Protein,Surface Protein,Surface Proteins,Integral Membrane Protein,Membrane Protein,Membrane-Associated Proteins,Membrane Associated Protein,Membrane Associated Proteins,Membrane Protein, Cell,Membrane Protein, Integral,Membrane Proteins, Integral,Protein, Cell Membrane,Protein, Cell Surface,Protein, Integral Membrane,Protein, Membrane,Protein, Membrane-Associated,Protein, Surface,Proteins, Cell Membrane,Proteins, Cell Surface,Proteins, Integral Membrane,Proteins, Membrane,Proteins, Membrane-Associated,Proteins, Surface,Surface Protein, Cell
D008969 Molecular Sequence Data Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. Sequence Data, Molecular,Molecular Sequencing Data,Data, Molecular Sequence,Data, Molecular Sequencing,Sequencing Data, Molecular
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000595 Amino Acid Sequence The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
D000818 Animals Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. Animal,Metazoa,Animalia
D001499 Bayes Theorem A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result. Bayesian Analysis,Bayesian Estimation,Bayesian Forecast,Bayesian Method,Bayesian Prediction,Analysis, Bayesian,Bayesian Approach,Approach, Bayesian,Approachs, Bayesian,Bayesian Approachs,Estimation, Bayesian,Forecast, Bayesian,Method, Bayesian,Prediction, Bayesian,Theorem, Bayes
D015003 Yeasts A general term for single-celled rounded fungi that reproduce by budding. Brewers' and bakers' yeasts are SACCHAROMYCES CEREVISIAE; therapeutic dried yeast is YEAST, DRIED. Yeast
D016415 Sequence Alignment The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. Sequence Homology Determination,Determination, Sequence Homology,Alignment, Sequence,Alignments, Sequence,Determinations, Sequence Homology,Sequence Alignments,Sequence Homology Determinations
D017124 Conserved Sequence A sequence of amino acids in a polypeptide or of nucleotides in DNA or RNA that is similar across multiple species. A known set of conserved sequences is represented by a CONSENSUS SEQUENCE. AMINO ACID MOTIFS are often composed of conserved sequences. Conserved Sequences,Sequence, Conserved,Sequences, Conserved
D017421 Sequence Analysis A multistage process that includes the determination of a sequence (protein, carbohydrate, etc.), its fragmentation and analysis, and the interpretation of the resulting sequence information. Sequence Determination,Analysis, Sequence,Determination, Sequence,Determinations, Sequence,Sequence Determinations,Analyses, Sequence,Sequence Analyses

Related Publications

Andrew F Neuwald
December 2004, Trends in cell biology,
Andrew F Neuwald
September 2013, Experimental cell research,
Andrew F Neuwald
August 2020, American journal of physiology. Cell physiology,
Andrew F Neuwald
August 2000, Current opinion in cell biology,
Andrew F Neuwald
August 2005, Biochemical Society transactions,
Andrew F Neuwald
January 2008, Traffic (Copenhagen, Denmark),
Andrew F Neuwald
January 1996, Journal of cell science,
Andrew F Neuwald
January 2015, Journal of immunology research,
Copied contents to your clipboard!