Detection of protein similarities using nucleotide sequence databases. 1988

S Henikoff and J C Wallace
Fred Hutchinson Cancer Research Center, Seattle, WA 98104.

A simple procedure is described for finding similarities between proteins using nucleotide sequence databases. The approach is illustrated by several examples of previously unknown correspondences with important biological implications: Drosophila elongation factor Tu is shown to be encoded by two genes that are differently expressed during development; a cluster of three Drosophila genes likely encode maltases; a flesh-fly fat body protein resembles the hypothesized Drosophila alcohol dehydrogenase ancestral protein; an unknown protein encoded at the multifunctional E. coli hisT locus resembles aspartate beta-semialdehyde dehydrogenase; and the E. coli tyrR protein is related to nitrogen regulatory proteins. These and other matches were discovered using a personal computer of the type available in most laboratories collecting DNA sequence data. As relatively few sequences were sampled to find these matches, it is likely that much of the existing data has not been adequately examined.
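The abstract does not spell out the procedure itself. As a hedged illustration only, one generic way to compare a protein against nucleotide data is to translate each DNA entry in all six reading frames and look for the protein in the resulting translations. The sketch below assumes the standard genetic code and uses exact substring matching as a stand-in for a real scored similarity search; the function names (`six_frames`, `find_peptide`) are hypothetical and are not taken from the paper.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each position (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate a DNA sequence in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_peptide(query, dna):
    """List the frames whose translation contains the query peptide exactly.

    Exact matching is a simplification for illustration; a real search
    would score approximate alignments instead.
    """
    return [name for name, prot in six_frames(dna).items() if query in prot]


if __name__ == "__main__":
    example = "ATGGCCAAATAA"  # made-up sequence, for illustration only
    for name, protein in six_frames(example).items():
        print(name, protein)
    print(find_peptide("MAK", example))
```

Translating both strands matters because a database entry may record either strand of a gene, and the reading frame of an unannotated sequence is not known in advance.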

UI | MeSH Term | Description | Entry Terms
D007256 | Information Systems | Integrated set of files, procedures, and equipment for the storage, manipulation, and retrieval of information. | Ancillary Information Systems; Emergency Care Information Systems; Information Retrieval Systems; Perinatal Information System; Ancillary Information System; Information Retrieval System; Information System; Information System, Ancillary; Information System, Perinatal; Perinatal Information Systems; Systems, Information Retrieval
D008969 | Molecular Sequence Data | Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. | Sequence Data, Molecular; Molecular Sequencing Data; Data, Molecular Sequence; Data, Molecular Sequencing; Sequencing Data, Molecular
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein; Gene Proteins; Protein; Protein Gene Products; Proteins, Gene
D004331 | Drosophila melanogaster | A species of fruit fly frequently used in genetics because of the large size of its chromosomes. | D. melanogaster; Drosophila melanogasters; melanogaster, Drosophila
D004926 | Escherichia coli | A species of gram-negative, facultatively anaerobic, rod-shaped bacteria (GRAM-NEGATIVE FACULTATIVELY ANAEROBIC RODS) commonly found in the lower part of the intestine of warm-blooded animals. It is usually nonpathogenic, but some strains are known to produce DIARRHEA and pyogenic infections. Pathogenic strains (virotypes) are classified by their specific pathogenic mechanisms such as toxins (ENTEROTOXIGENIC ESCHERICHIA COLI), etc. | Alkalescens-Dispar Group; Bacillus coli; Bacterium coli; Bacterium coli commune; Diffusely Adherent Escherichia coli; E coli; EAggEC; Enteroaggregative Escherichia coli; Enterococcus coli; Diffusely Adherent E. coli; Enteroaggregative E. coli; Enteroinvasive E. coli; Enteroinvasive Escherichia coli
D000426 | Alcohol Dehydrogenase | A zinc-containing enzyme which oxidizes primary and secondary alcohols or hemiacetals in the presence of NAD. In alcoholic fermentation, it catalyzes the final step of reducing an aldehyde to an alcohol in the presence of NADH and hydrogen. | Alcohol Dehydrogenase (NAD+); Alcohol Dehydrogenase I; Alcohol Dehydrogenase II; Alcohol-NAD+ Oxidoreductase; Yeast Alcohol Dehydrogenase; Alcohol Dehydrogenase, Yeast; Alcohol NAD+ Oxidoreductase; Dehydrogenase, Alcohol; Dehydrogenase, Yeast Alcohol; Oxidoreductase, Alcohol-NAD+
D000520 | alpha-Glucosidases | Enzymes that catalyze the exohydrolysis of 1,4-alpha-glucosidic linkages with release of alpha-glucose. Deficiency of alpha-1,4-glucosidase may cause GLYCOGEN STORAGE DISEASE TYPE II. | Acid Maltase; Lysosomal alpha-Glucosidase; Maltase; Maltases; Maltase-Glucoamylase; Neutral Maltase; Neutral alpha-Glucosidase; alpha-Glucosidase; Lysosomal alpha Glucosidase; Maltase Glucoamylase; Neutral alpha Glucosidase; alpha Glucosidase; alpha Glucosidases; alpha-Glucosidase, Lysosomal; alpha-Glucosidase, Neutral
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary; Amino Acid Sequences; Sequence, Amino Acid; Sequences, Amino Acid; Primary Protein Structure; Primary Protein Structures; Protein Structures, Primary; Structure, Primary Protein; Structures, Primary Protein
D000818 | Animals | Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. | Animal; Metazoa; Animalia
D001426 | Bacterial Proteins | Proteins found in any species of bacterium. | Bacterial Gene Products; Bacterial Gene Proteins; Gene Products, Bacterial; Bacterial Gene Product; Bacterial Gene Protein; Bacterial Protein; Gene Product, Bacterial; Gene Protein, Bacterial; Gene Proteins, Bacterial; Protein, Bacterial; Proteins, Bacterial
