A single ancient origin for prototypical serine/arginine-rich splicing factors. 2012

Sophie Califice, Denis Baurain, Marc Hanikenne, and Patrick Motte
Laboratory of Functional Genomics and Plant Molecular Imaging and Centre for Assistance in Technology of Microscopy, Department of Life Sciences, Institute of Botany, University of Liège, B-4000 Liège, Belgium.

Eukaryotic precursor mRNA splicing is a process involving a very complex RNA-protein edifice. Serine/arginine-rich (SR) proteins play essential roles in precursor mRNA constitutive and alternative splicing and have been suggested to be crucial in plant-specific forms of developmental regulation and environmental adaptation. Despite their functional importance, little is known about their origin and evolutionary history. SR splicing factors have a modular organization featuring at least one RNA recognition motif (RRM) domain and a carboxyl-terminal region enriched in serine/arginine dipeptides. To investigate the evolution of SR proteins, we infer phylogenies for more than 12,000 RRM domains representing more than 200 broadly sampled organisms. Our analyses reveal that the RRM domain is not restricted to eukaryotes and that all prototypical SR proteins share a single ancient origin, including the plant-specific SR45 protein. Based on these findings, we propose a scenario for their diversification into four natural families, each corresponding to a main SR architecture, and a dozen subfamilies, of which we profile both sequence conservation and composition. Finally, using operational criteria for computational discovery and classification, we catalog SR proteins in 20 model organisms, with a focus on green algae and land plants. Altogether, our study confirms the homogeneity and antiquity of SR splicing factors while establishing robust phylogenetic relationships between animal and plant proteins, which should enable functional analyses of lesser characterized SR family members, especially in green plants.
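The abstract describes SR proteins as carrying a carboxyl-terminal region enriched in serine/arginine dipeptides. As a rough illustration of how such enrichment could be scored, the following minimal sketch measures the fraction of a C-terminal tail covered by SR or RS dipeptides. The function names, the 50-residue tail length, and the 0.4 threshold are assumptions chosen for illustration; they are not the operational criteria used in the study, which are not given in this abstract.

```python
def sr_dipeptide_fraction(seq: str, tail_len: int = 50) -> float:
    """Fraction of C-terminal tail residues that lie in an SR or RS dipeptide.

    tail_len is a hypothetical window size, not a value from the paper.
    """
    if tail_len <= 0:
        raise ValueError("tail_len must be positive")
    tail = seq.upper()[-tail_len:]  # whole sequence if it is shorter than tail_len
    if len(tail) < 2:
        return 0.0
    covered = [False] * len(tail)
    for i in range(len(tail) - 1):
        if tail[i:i + 2] in ("SR", "RS"):
            covered[i] = covered[i + 1] = True
    return sum(covered) / len(tail)


def looks_sr_rich(seq: str, tail_len: int = 50, min_fraction: float = 0.4) -> bool:
    """Hypothetical screen: flag sequences whose tail passes an assumed threshold."""
    return sr_dipeptide_fraction(seq, tail_len) >= min_fraction
```

A screen like this only addresses the SR-rich tail; a real classification would also need to confirm the presence of at least one RRM domain, for example with a profile-based domain search.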

UI | MeSH Term | Description | Entries
D008969 | Molecular Sequence Data | Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. | Sequence Data, Molecular,Molecular Sequencing Data,Data, Molecular Sequence,Data, Molecular Sequencing,Sequencing Data, Molecular
D010802 | Phylogeny | The relationships of groups of organisms as reflected by their genetic makeup. | Community Phylogenetics,Molecular Phylogenetics,Phylogenetic Analyses,Phylogenetic Analysis,Phylogenetic Clustering,Phylogenetic Comparative Analysis,Phylogenetic Comparative Methods,Phylogenetic Distance,Phylogenetic Generalized Least Squares,Phylogenetic Groups,Phylogenetic Incongruence,Phylogenetic Inference,Phylogenetic Networks,Phylogenetic Reconstruction,Phylogenetic Relatedness,Phylogenetic Relationships,Phylogenetic Signal,Phylogenetic Structure,Phylogenetic Tree,Phylogenetic Trees,Phylogenomics,Analyse, Phylogenetic,Analysis, Phylogenetic,Analysis, Phylogenetic Comparative,Clustering, Phylogenetic,Community Phylogenetic,Comparative Analysis, Phylogenetic,Comparative Method, Phylogenetic,Distance, Phylogenetic,Group, Phylogenetic,Incongruence, Phylogenetic,Inference, Phylogenetic,Method, Phylogenetic Comparative,Molecular Phylogenetic,Network, Phylogenetic,Phylogenetic Analyse,Phylogenetic Clusterings,Phylogenetic Comparative Analyses,Phylogenetic Comparative Method,Phylogenetic Distances,Phylogenetic Group,Phylogenetic Incongruences,Phylogenetic Inferences,Phylogenetic Network,Phylogenetic Reconstructions,Phylogenetic Relatednesses,Phylogenetic Relationship,Phylogenetic Signals,Phylogenetic Structures,Phylogenetic, Community,Phylogenetic, Molecular,Phylogenies,Phylogenomic,Reconstruction, Phylogenetic,Relatedness, Phylogenetic,Relationship, Phylogenetic,Signal, Phylogenetic,Structure, Phylogenetic,Tree, Phylogenetic
D010940 | Plant Proteins | Proteins found in plants (flowers, herbs, shrubs, trees, etc.). The concept does not include proteins found in vegetables for which PLANT PROTEINS, DIETARY is available. | Plant Protein,Protein, Plant,Proteins, Plant
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
D001120 | Arginine | An essential amino acid that is physiologically active in the L-form. | Arginine Hydrochloride,Arginine, L-Isomer,DL-Arginine Acetate, Monohydrate,L-Arginine,Arginine, L Isomer,DL Arginine Acetate, Monohydrate,Hydrochloride, Arginine,L Arginine,L-Isomer Arginine,Monohydrate DL-Arginine Acetate
D012326 | RNA Splicing | The ultimate exclusion of nonsense sequences or intervening sequences (introns) before the final RNA transcript is sent to the cytoplasm. | RNA, Messenger, Splicing,Splicing, RNA,RNA Splicings,Splicings, RNA
D012333 | RNA, Messenger | RNA sequences that serve as templates for protein synthesis. Bacterial mRNAs are generally primary transcripts in that they do not require post-transcriptional processing. Eukaryotic mRNA is synthesized in the nucleus and must be exported to the cytoplasm for translation. Most eukaryotic mRNAs have a sequence of polyadenylic acid at the 3' end, referred to as the poly(A) tail. The function of this tail is not known for certain, but it may play a role in the export of mature mRNA from the nucleus as well as in helping stabilize some mRNA molecules by retarding their degradation in the cytoplasm. | Messenger RNA,Messenger RNA, Polyadenylated,Poly(A) Tail,Poly(A)+ RNA,Poly(A)+ mRNA,RNA, Messenger, Polyadenylated,RNA, Polyadenylated,mRNA,mRNA, Non-Polyadenylated,mRNA, Polyadenylated,Non-Polyadenylated mRNA,Poly(A) RNA,Polyadenylated mRNA,Non Polyadenylated mRNA,Polyadenylated Messenger RNA,Polyadenylated RNA,RNA, Polyadenylated Messenger,mRNA, Non Polyadenylated
D012694 | Serine | A non-essential amino acid occurring in natural form as the L-isomer. It is synthesized from GLYCINE or THREONINE. It is involved in the biosynthesis of PURINES; PYRIMIDINES; and other amino acids. | L-Serine,L Serine
