DNA sequence comparison considering both amino acid and nucleotide insertions/deletions because of evolution and experimental error. 1999

R Irie, S Hiraoka, N Kasahara, K Nagai
Hitachi Ltd., Central Research Laboratory, Tokyo, Japan. r-irie@crl.hitachi.co.jp

Elucidating gene function through DNA sequence comparison often requires taking amino acid similarity into account. We propose a Smith-Waterman-like algorithm that considers amino acid similarity together with insertions/deletions at both the DNA level and the protein level in a hybrid manner. The algorithm is applied to cDNA sequences of Oryza sativa and Arabidopsis thaliana, and the results are compared with those of NCBI's tblastx program (which compares the sequences in the BLAST manner after translation). The present algorithm is shown to be very helpful in discovering nucleotide insertions/deletions that originate from experimental errors as well as amino acid insertions/deletions that arise through evolution.

UI MeSH Term Description Entries
D008969 Molecular Sequence Data Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. Sequence Data, Molecular,Molecular Sequencing Data,Data, Molecular Sequence,Data, Molecular Sequencing,Sequencing Data, Molecular
D010940 Plant Proteins Proteins found in plants (flowers, herbs, shrubs, trees, etc.). The concept does not include proteins found in vegetables for which PLANT PROTEINS, DIETARY is available. Plant Protein,Protein, Plant,Proteins, Plant
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D000595 Amino Acid Sequence The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
D001483 Base Sequence The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. DNA Sequence,Nucleotide Sequence,RNA Sequence,DNA Sequences,Base Sequences,Nucleotide Sequences,RNA Sequences,Sequence, Base,Sequence, DNA,Sequence, Nucleotide,Sequence, RNA,Sequences, Base,Sequences, DNA,Sequences, Nucleotide,Sequences, RNA
D012275 Oryza A genus of the grass family (Poaceae) that includes several rice species. Oryza sativa,Rice,Rices
D012689 Sequence Homology, Nucleic Acid The sequential correspondence of nucleotides in one nucleic acid molecule with those of another nucleic acid molecule. Sequence homology is an indication of the genetic relatedness of different organisms and gene function. Base Sequence Homology,Homologous Sequences, Nucleic Acid,Homologs, Nucleic Acid Sequence,Homology, Base Sequence,Homology, Nucleic Acid Sequence,Nucleic Acid Sequence Homologs,Nucleic Acid Sequence Homology,Sequence Homology, Base,Base Sequence Homologies,Homologies, Base Sequence,Sequence Homologies, Base
D016208 Databases, Factual Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references. Databanks, Factual,Data Banks, Factual,Data Bases, Factual,Data Bank, Factual,Data Base, Factual,Databank, Factual,Database, Factual,Factual Data Bank,Factual Data Banks,Factual Data Base,Factual Data Bases,Factual Databank,Factual Databanks,Factual Database,Factual Databases
D016415 Sequence Alignment The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. Sequence Homology Determination,Determination, Sequence Homology,Alignment, Sequence,Alignments, Sequence,Determinations, Sequence Homology,Sequence Alignments,Sequence Homology Determinations
D017353 Gene Deletion A genetic rearrangement through loss of segments of DNA or RNA, bringing sequences which are normally separated into close proximity. This deletion may be detected using cytogenetic techniques and can also be inferred from the phenotype, indicating a deletion at one specific locus. Deletion, Gene,Deletions, Gene,Gene Deletions
