Recovering probabilities for nucleotide trimming processes for T cell receptor TRA and TRG V-J junctions analyzed with IMGT tools. 2008

Kevin Bleakley, Marie-Paule Lefranc, and Gérard Biau
Institut Curie, Centre de Recherche, Paris, F-75248, France. bleakley@math.univ-montp2.fr

BACKGROUND Nucleotides are trimmed from the ends of variable (V), diversity (D) and joining (J) genes during immunoglobulin (IG) and T cell receptor (TR) rearrangements in B cells and T cells of the immune system. This trimming is followed by the addition of nucleotides at random, forming the N regions (N for nucleotides) of the V-J and V-D-J junctions. These processes are crucial for creating diversity in the immune response, since the number of trimmed nucleotides and the number of added nucleotides vary in each B or T cell. The IMGT sequence analysis tools, IMGT/V-QUEST and IMGT/JunctionAnalysis, provide a detailed and accurate analysis of the final observed junction nucleotide sequences (tool "output"). However, as trimmed nucleotides can potentially be replaced by identical N region nucleotides during the process, the observed "output" represents a biased estimate of the "true trimming process."
RESULTS A probabilistic approach based on an analysis of the standardized tool "output" is proposed to infer the probability distribution of the "true trimming process" and to provide plausible biological hypotheses explaining this process. We collated a benchmark dataset of TR alpha (TRA) and TR gamma (TRG) V-J rearranged sequences and junctions analyzed with IMGT/V-QUEST and IMGT/JunctionAnalysis, the nucleotide sequence analysis tools from IMGT, the international ImMunoGeneTics information system, http://imgt.cines.fr. The standardized description of the tool output is based on the IMGT-ONTOLOGY axioms and concepts. We propose a simple first-order model that attempts to transform the observed "output" probability distribution into an estimate closer to the "true trimming process" probability distribution. We use this estimate to test the hypothesis that Poisson processes are involved in trimming. This hypothesis was not rejected at standard confidence levels for three of the four trimming processes: TRAV, TRAJ and TRGV.
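To make the Poisson hypothesis concrete, the sketch below computes a Pearson chi-square goodness-of-fit statistic for a list of per-sequence trimming counts. It is only an illustration under stated assumptions: the abstract does not say which test the authors used, the function names (`poisson_pmf`, `poisson_chi_square`) and the bin-pooling rule are hypothetical choices, and the toy counts are made up rather than taken from the benchmark dataset.

```python
import math
from collections import Counter


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_chi_square(trim_counts, min_expected=5.0):
    """Pearson chi-square statistic for a Poisson fit to trimming counts.

    trim_counts: non-negative ints, nucleotides trimmed per sequence.
    Returns (lambda_hat, statistic, degrees_of_freedom).
    """
    data = list(trim_counts)
    n = len(data)
    lam = sum(data) / n  # maximum-likelihood estimate of the Poisson rate
    observed = Counter(data)
    k_max = max(data)

    # Bins 0..k_max-1 plus one tail bin ">= k_max", so that the expected
    # counts sum to n.
    obs = [observed.get(k, 0) for k in range(k_max)]
    exp = [n * poisson_pmf(k, lam) for k in range(k_max)]
    obs.append(observed.get(k_max, 0))
    exp.append(n - sum(exp))

    # Pool adjacent bins (left to right) until each pooled bin has an
    # expected count of at least min_expected (an assumed rule of thumb).
    pooled_obs, pooled_exp = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(obs, exp):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0:  # fold any leftover into the last pooled bin
        if pooled_exp:
            pooled_obs[-1] += acc_o
            pooled_exp[-1] += acc_e
        else:
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)

    statistic = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
    # One degree of freedom is lost to the total, one to the estimated rate.
    dof = len(pooled_exp) - 2
    return lam, statistic, dof


if __name__ == "__main__":
    # Made-up trimming counts, for illustration only.
    toy = [0] * 14 + [1] * 27 + [2] * 27 + [3] * 18 + [4] * 9 + [5] * 4 + [6] * 1
    lam_hat, stat, dof = poisson_chi_square(toy)
    print(f"lambda_hat={lam_hat:.3f} chi2={stat:.3f} dof={dof}")
```

The statistic would then be compared with a chi-square critical value for the returned degrees of freedom; the Poisson hypothesis is not rejected when the statistic falls below that value.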
CONCLUSIONS By using trimming of rearranged TR genes as a benchmark, we show that a probabilistic approach, applied to IMGT standardized tool "outputs", opens the way to plausible hypotheses on the events involved in the "true trimming process" and eventually to an exact quantification of trimming itself. With the growing volume of high-throughput standardized immunogenetics data, similar probabilistic approaches will improve understanding of processes so far only characterized by the "output" of standardized tools.
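The gap between the tool "output" and the "true trimming process" can be illustrated with a small simulation. The sketch below is not the authors' first-order model: it assumes that N nucleotides are drawn uniformly from A, C, G and T, that true trimming lengths are uniform on a small range, and that a tool attributes any N nucleotide matching the germline to the gene. The germline string and all function names are hypothetical.

```python
import random


def observed_trimming(germline_3prime: str, true_trim: int, n_region: str) -> int:
    """Trimming a tool would report for one V gene 3' end.

    germline_3prime: germline 3' end of the gene, written 5'->3'.
    true_trim: nucleotides actually removed (assumed <= len(germline_3prime)).
    n_region: randomly added nucleotides, written 5'->3'.
    Leading N nucleotides identical to the removed germline nucleotides are
    indistinguishable from untrimmed germline, so they reduce the observed
    trimming.
    """
    removed = germline_3prime[len(germline_3prime) - true_trim:]
    restored = 0
    for germ_nt, n_nt in zip(removed, n_region):
        if germ_nt != n_nt:
            break
        restored += 1
    return true_trim - restored


def simulate(n_seqs=10000, seed=0, germline="TGTGCTGTGAGAGA", max_trim=8, max_n=6):
    """Return (mean true trimming, mean observed trimming) over n_seqs junctions.

    The germline sequence is a made-up example, not a real TRAV allele.
    """
    rng = random.Random(seed)
    true_total = obs_total = 0
    for _ in range(n_seqs):
        t = rng.randint(0, max_trim)
        n_len = rng.randint(0, max_n)
        n_region = "".join(rng.choice("ACGT") for _ in range(n_len))
        true_total += t
        obs_total += observed_trimming(germline, t, n_region)
    return true_total / n_seqs, obs_total / n_seqs


if __name__ == "__main__":
    mean_true, mean_obs = simulate()
    print(f"mean true trimming: {mean_true:.3f}")
    print(f"mean observed trimming: {mean_obs:.3f}")
```

Under these assumptions the observed trimming can never exceed the true trimming, which is the direction of the bias the abstract describes.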

UI | MeSH Term | Description | Entries
D007125 | Immunogenetics | A subdiscipline of genetics which deals with the genetic basis of the immune response (IMMUNITY). | Immunogenetic
D007136 | Immunoglobulins | Multi-subunit proteins which function in IMMUNITY. They are produced by B LYMPHOCYTES from the IMMUNOGLOBULIN GENES. They are comprised of two heavy (IMMUNOGLOBULIN HEAVY CHAINS) and two light chains (IMMUNOGLOBULIN LIGHT CHAINS) with additional ancillary polypeptide chains depending on their isoforms. The variety of isoforms include monomeric or polymeric forms, and transmembrane forms (B-CELL ANTIGEN RECEPTORS) or secreted forms (ANTIBODIES). They are divided by the amino acid sequence of their heavy chains into five classes (IMMUNOGLOBULIN A; IMMUNOGLOBULIN D; IMMUNOGLOBULIN E; IMMUNOGLOBULIN G; IMMUNOGLOBULIN M) and various subclasses. | Globulins, Immune; Immune Globulin; Immune Globulins; Immunoglobulin; Globulin, Immune
D008957 | Models, Genetic | Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment. | Genetic Models; Genetic Model; Model, Genetic
D009711 | Nucleotides | The monomeric units from which DNA or RNA polymers are constructed. They consist of a purine or pyrimidine base, a pentose sugar, and a phosphate group. (From King & Stansfield, A Dictionary of Genetics, 4th ed) | Nucleotide
D011336 | Probability | The study of chance processes or the relative frequency characterizing a chance process. | Probabilities
D011948 | Receptors, Antigen, T-Cell | Molecules on the surface of T-lymphocytes that recognize and combine with antigens. The receptors are non-covalently associated with a complex of several polypeptides collectively called CD3 antigens (CD3 COMPLEX). Recognition of foreign antigen and the major histocompatibility complex is accomplished by a single heterodimeric antigen-receptor structure, composed of either alpha-beta (RECEPTORS, ANTIGEN, T-CELL, ALPHA-BETA) or gamma-delta (RECEPTORS, ANTIGEN, T-CELL, GAMMA-DELTA) chains. | Antigen Receptors, T-Cell; T-Cell Receptors; Receptors, T-Cell Antigen; T-Cell Antigen Receptor; T-Cell Receptor; Antigen Receptor, T-Cell; Antigen Receptors, T Cell; Receptor, T-Cell; Receptor, T-Cell Antigen; Receptors, T Cell Antigen; Receptors, T-Cell; T Cell Antigen Receptor; T Cell Receptor; T Cell Receptors; T-Cell Antigen Receptors
D005803 | Genes, Immunoglobulin | Genes encoding the different subunits of the IMMUNOGLOBULINS, for example the IMMUNOGLOBULIN LIGHT CHAIN GENES and the IMMUNOGLOBULIN HEAVY CHAIN GENES. The heavy and light immunoglobulin genes are present as gene segments in the germline cells. The completed genes are created when the segments are shuffled and assembled (B-LYMPHOCYTE GENE REARRANGEMENT) during B-LYMPHOCYTE maturation. The gene segments of the human light and heavy chain germline genes are symbolized V (variable), J (joining) and C (constant). The heavy chain germline genes have an additional segment D (diversity). | Genes, Ig; Immunoglobulin Genes; Gene, Ig; Gene, Immunoglobulin; Ig Gene; Ig Genes; Immunoglobulin Gene
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D001483 | Base Sequence | The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. | DNA Sequence; Nucleotide Sequence; RNA Sequence; DNA Sequences; Base Sequences; Nucleotide Sequences; RNA Sequences; Sequence, Base; Sequence, DNA; Sequence, Nucleotide; Sequence, RNA; Sequences, Base; Sequences, DNA; Sequences, Nucleotide; Sequences, RNA
D015322 | Gene Rearrangement, B-Lymphocyte | Ordered rearrangement of B-lymphocyte variable gene regions coding for the IMMUNOGLOBULIN CHAINS, thereby contributing to antibody diversity. It occurs during the differentiation of the IMMATURE B-LYMPHOCYTES. | B-Cell Gene Rearrangement; B-Lymphocyte Gene Rearrangement; Gene Rearrangement, B-Cell; B Cell Gene Rearrangement; B Lymphocyte Gene Rearrangement; B-Cell Gene Rearrangements; B-Lymphocyte Gene Rearrangements; Gene Rearrangement, B Cell; Gene Rearrangement, B Lymphocyte; Gene Rearrangements, B-Cell; Gene Rearrangements, B-Lymphocyte; Rearrangement, B-Cell Gene; Rearrangement, B-Lymphocyte Gene; Rearrangements, B-Cell Gene; Rearrangements, B-Lymphocyte Gene
