Comparative statistics for DNA and protein sequences: single sequence analysis. 1985

S Karlin and G Ghandour

Four categories of data representations are used to help interpret structures and similarities of nucleic acid and protein sequences. The statistical significance of the relationships revealed by these representations is assessed by a hierarchy of permutation procedures and by comparisons with theoretical random models. Applications are presented for various DNA sequences, including papovaviruses, Epstein-Barr virus, mitochondrial genomes, and several globin and immunoglobulin genes.
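The idea of assessing significance by permutation can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not specify the statistics or the levels of the permutation hierarchy, so the example assumes the simplest case, a full shuffle that preserves only base composition, and uses the length of the longest repeated word as a stand-in statistic. The function names `longest_repeat_length` and `permutation_p_value` are hypothetical and chosen for illustration only.

```python
import random


def has_repeat(seq, k):
    """Return True if some word of length k occurs at least twice in seq."""
    seen = set()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in seen:
            return True
        seen.add(word)
    return False


def longest_repeat_length(seq):
    """Length of the longest word occurring at least twice (overlaps allowed)."""
    k = 0
    # A repeat of length k+1 implies a repeat of length k, so a linear scan suffices.
    while k < len(seq) and has_repeat(seq, k + 1):
        k += 1
    return k


def permutation_p_value(seq, statistic, n_shuffles=200, seed=0):
    """Estimate how often a composition-preserving shuffle matches or exceeds
    the observed statistic. Assumption: a full shuffle is an adequate null model."""
    rng = random.Random(seed)
    observed = statistic(seq)
    letters = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if statistic("".join(letters)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    dna = "ACGTTGCATGCAACGTTGCATGCA"
    obs, p = permutation_p_value(dna, longest_repeat_length)
    print(f"longest repeat = {obs}, permutation p-value = {p:.3f}")
```

A small p-value here would suggest that the observed repeat is longer than expected from base composition alone. A more refined null model, for example one that also preserves dinucleotide or codon frequencies, would need a different shuffling scheme than the one assumed in this sketch.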

UI MeSH Term Description Entries
D007136 Immunoglobulins Multi-subunit proteins which function in IMMUNITY. They are produced by B LYMPHOCYTES from the IMMUNOGLOBULIN GENES. They are comprised of two heavy (IMMUNOGLOBULIN HEAVY CHAINS) and two light chains (IMMUNOGLOBULIN LIGHT CHAINS) with additional ancillary polypeptide chains depending on their isoforms. The variety of isoforms include monomeric or polymeric forms, and transmembrane forms (B-CELL ANTIGEN RECEPTORS) or secreted forms (ANTIBODIES). They are divided by the amino acid sequence of their heavy chains into five classes (IMMUNOGLOBULIN A; IMMUNOGLOBULIN D; IMMUNOGLOBULIN E; IMMUNOGLOBULIN G; IMMUNOGLOBULIN M) and various subclasses. Globulins, Immune,Immune Globulin,Immune Globulins,Immunoglobulin,Globulin, Immune
D011506 Proteins Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D012091 Repetitive Sequences, Nucleic Acid Sequences of DNA or RNA that occur in multiple copies. There are several types: INTERSPERSED REPETITIVE SEQUENCES are copies of transposable elements (DNA TRANSPOSABLE ELEMENTS or RETROELEMENTS) dispersed throughout the genome. TERMINAL REPEAT SEQUENCES flank both ends of another sequence, for example, the long terminal repeats (LTRs) on RETROVIRUSES. Variations may be direct repeats, those occurring in the same direction, or inverted repeats, those opposite to each other in direction. TANDEM REPEAT SEQUENCES are copies which lie adjacent to each other, direct or inverted (INVERTED REPEAT SEQUENCES). DNA Repetitious Region,Direct Repeat,Genes, Selfish,Nucleic Acid Repetitive Sequences,Repetitive Region,Selfish DNA,Selfish Genes,DNA, Selfish,Repetitious Region, DNA,Repetitive Sequence,DNA Repetitious Regions,DNAs, Selfish,Direct Repeats,Gene, Selfish,Repeat, Direct,Repeats, Direct,Repetitious Regions, DNA,Repetitive Regions,Repetitive Sequences,Selfish DNAs,Selfish Gene
D004247 DNA A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine). DNA, Double-Stranded,Deoxyribonucleic Acid,ds-DNA,DNA, Double Stranded,Double-Stranded DNA,ds DNA
D004272 DNA, Mitochondrial Double-stranded DNA of MITOCHONDRIA. In eukaryotes, the mitochondrial GENOME is circular and codes for ribosomal RNAs, transfer RNAs, and about 10 proteins. Mitochondrial DNA,mtDNA
D004854 Herpesvirus 4, Human The type species of LYMPHOCRYPTOVIRUS, subfamily GAMMAHERPESVIRINAE, infecting B-cells in humans. It is thought to be the causative agent of INFECTIOUS MONONUCLEOSIS and is strongly associated with oral hairy leukoplakia (LEUKOPLAKIA, HAIRY), BURKITT LYMPHOMA, and other malignancies. Burkitt Herpesvirus,Burkitt Lymphoma Virus,E-B Virus,EBV,Epstein-Barr Virus,Human Herpesvirus 4,Infectious Mononucleosis Virus,Burkitt's Lymphoma Virus,HHV-4,Herpesvirus 4 (gamma), Human,Burkitts Lymphoma Virus,E B Virus,E-B Viruses,Epstein Barr Virus,Herpesvirus, Burkitt,Infectious Mononucleosis Viruses,Lymphoma Virus, Burkitt,Mononucleosis Virus, Infectious,Mononucleosis Viruses, Infectious
D005914 Globins A superfamily of proteins containing the globin fold, which is composed of 6-8 alpha helices arranged in a characteristic HEME-enclosing structure. Globin
D000595 Amino Acid Sequence The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
D000818 Animals Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. Animal,Metazoa,Animalia
D001483 Base Sequence The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. DNA Sequence,Nucleotide Sequence,RNA Sequence,DNA Sequences,Base Sequences,Nucleotide Sequences,RNA Sequences,Sequence, Base,Sequence, DNA,Sequence, Nucleotide,Sequence, RNA,Sequences, Base,Sequences, DNA,Sequences, Nucleotide,Sequences, RNA
