Recognition of regulatory regions in genomic sequences. 1994

E Wingender
Department of Genetics, Gesellschaft für Biotechnologische Forschung, Braunschweig, Germany.

For the functional interpretation of genomic sequences, effective algorithms have to be developed that recognize regions of specific function and thus suggest experiments for their verification. As a first step, relevant data have to be collected in an appropriate database from which suitable training sets can be extracted. In this paper, I discuss the requirements for a database that collects information about regulatory DNA sequences and describe the structure and contents of such a database (TRANSFAC). This compiled information will serve as a basis for comprehensive analysis of sites that regulate transcription, e.g., by statistical methods. It will thus facilitate the recognition of regulatory genomic sequence information and the assignment of the corresponding regulators. Moreover, it will provide all relevant data about the regulating proteins, which will allow transcriptional control cascades to be traced back to their origin.
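
As one illustration of the kind of statistical analysis a training set of regulatory sites could support, the sketch below builds a log-odds position weight matrix from aligned sites and scans a sequence for the best-scoring window. This is a generic technique chosen for illustration, not a method described in the abstract. The function names, the pseudocount, the uniform background frequency, and the example sites are all assumptions; the sites are invented and are not TRANSFAC entries.

```python
import math

def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from aligned, equal-length sites.

    Assumes a uniform background frequency for A, C, G and T.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((count / total) / background)
             for base, count in counts.items()}
        )
    return matrix

def best_hit(matrix, sequence):
    """Return (start, score) of the highest-scoring window, or None if too short."""
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(matrix, window))
        if best is None or score > best[1]:
            best = (start, score)
    return best

# Hypothetical training set: invented TATA-like sites, not real database records.
example_sites = ["TATAAA", "TATAAT", "TATATA", "TTTAAA"]
pwm = build_log_odds(example_sites)
hit = best_hit(pwm, "GGCGCTATAAAGGCG")
print(hit)
```

The sketch only handles the bases A, C, G and T on one strand; ambiguity codes and reverse-complement scanning would need extra handling.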

UI MeSH Term Description Entries
D008969 Molecular Sequence Data Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. Sequence Data, Molecular,Molecular Sequencing Data,Data, Molecular Sequence,Data, Molecular Sequencing,Sequencing Data, Molecular
D011401 Promoter Regions, Genetic DNA sequences which are recognized (directly or indirectly) and bound by a DNA-dependent RNA polymerase during the initiation of transcription. Highly conserved sequences within the promoter include the Pribnow box in bacteria and the TATA BOX in eukaryotes. rRNA Promoter,Early Promoters, Genetic,Late Promoters, Genetic,Middle Promoters, Genetic,Promoter Regions,Promoter, Genetic,Promotor Regions,Promotor, Genetic,Pseudopromoter, Genetic,Early Promoter, Genetic,Genetic Late Promoter,Genetic Middle Promoters,Genetic Promoter,Genetic Promoter Region,Genetic Promoter Regions,Genetic Promoters,Genetic Promotor,Genetic Promotors,Genetic Pseudopromoter,Genetic Pseudopromoters,Late Promoter, Genetic,Middle Promoter, Genetic,Promoter Region,Promoter Region, Genetic,Promoter, Genetic Early,Promoter, rRNA,Promoters, Genetic,Promoters, Genetic Middle,Promoters, rRNA,Promotor Region,Promotors, Genetic,Pseudopromoters, Genetic,Region, Genetic Promoter,Region, Promoter,Region, Promotor,Regions, Genetic Promoter,Regions, Promoter,Regions, Promotor,rRNA Promoters
D011498 Protein Precursors Precursors, Protein
D004247 DNA A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine). DNA, Double-Stranded,Deoxyribonucleic Acid,ds-DNA,DNA, Double Stranded,Double-Stranded DNA,ds DNA
D004745 Enkephalins One of the three major families of endogenous opioid peptides. The enkephalins are pentapeptides that are widespread in the central and peripheral nervous systems and in the adrenal medulla. Enkephalin
D005809 Genes, Regulator Genes which regulate or circumscribe the activity of other genes; specifically, genes which code for PROTEINS or RNAs which have GENE EXPRESSION REGULATION functions. Gene, Regulator,Regulator Gene,Regulator Genes,Regulatory Genes,Gene, Regulatory,Genes, Regulatory,Regulatory Gene
D005821 Genetic Techniques Chromosomal, biochemical, intracellular, and other methods used in the study of genetics. Genetic Technic,Genetic Technics,Genetic Technique,Technic, Genetic,Technics, Genetic,Technique, Genetic,Techniques, Genetic
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D000595 Amino Acid Sequence The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
