Space-conserving optimal DNA-protein alignment. 2004

Pang Ko, Mahesh Narayanan, Anantharaman Kalyanaraman, and Srinivas Aluru
Department of Electrical and Computer Engineering, Iowa State University, USA. kopang@iastate.edu.

DNA-protein alignment algorithms can be used to discover coding sequences in a genomic sequence when the corresponding protein derivatives are known. They can also be used to identify potential coding sequences in a newly sequenced genome by using proteins from related species. Previously known algorithms either solve a simplified formulation or sacrifice optimality to achieve a practical implementation. In this paper, we present a comprehensive formulation of the DNA-protein alignment problem and an algorithm that computes the optimal alignment in O(mn) time using only four tables of size (m + 1) × (n + 1), where m and n are the lengths of the DNA and protein sequences, respectively. We also developed PanDA, a Protein and DNA Alignment program that implements the proposed solution. Experimental results indicate that our algorithm produces high-quality alignments.
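To make the idea of a table indexed by DNA position and protein position concrete, here is a minimal Python sketch of a simplified DNA-protein alignment. It is not the paper's algorithm: it fills a single (m + 1) × (n + 1) score table rather than four, models only codon substitutions, gaps, and one- or two-nucleotide frameshifts, ignores introns, and returns only the score. The function name `align_score` and all scoring values are assumptions chosen for illustration.

```python
from itertools import product

# Standard genetic code; codons are enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def align_score(dna, protein, match=2, mismatch=-1, gap=-2, frameshift=-3):
    """Global score of a simplified DNA-protein alignment (illustrative only).

    S[i][j] is the best score for aligning dna[:i] with protein[:j].
    The scoring parameters are hypothetical, not taken from the paper.
    Input sequences are assumed to be uppercase.
    """
    m, n = len(dna), len(protein)
    neg = float("-inf")
    S = [[neg] * (n + 1) for _ in range(m + 1)]
    S[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            best = neg
            if j >= 1:
                # Amino acid aligned to a gap in the DNA.
                best = max(best, S[i][j - 1] + gap)
            if i >= 1:
                # Frameshift: skip one nucleotide.
                best = max(best, S[i - 1][j] + frameshift)
            if i >= 2:
                # Frameshift: skip two nucleotides.
                best = max(best, S[i - 2][j] + frameshift)
            if i >= 3:
                # Whole codon aligned to a gap in the protein.
                best = max(best, S[i - 3][j] + gap)
                if j >= 1:
                    # Codon translated and compared with the amino acid.
                    aa = CODON.get(dna[i - 3:i], "X")
                    sub = match if aa == protein[j - 1] else mismatch
                    best = max(best, S[i - 3][j - 1] + sub)
            S[i][j] = best
    return S[m][n]


if __name__ == "__main__":
    print(align_score("ATGGCCAAA", "MAK"))   # in-frame coding sequence
    print(align_score("ATGGGCCAAA", "MAK"))  # one extra nucleotide inserted
```

The second call shows why frameshift moves matter: with one extra nucleotide in the DNA, the best alignment skips a single base and stays in frame for the remaining codons instead of mismatching everything downstream.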

UI | MeSH Term | Description | Entries
D008969 | Molecular Sequence Data | Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. | Sequence Data, Molecular; Molecular Sequencing Data; Data, Molecular Sequence; Data, Molecular Sequencing; Sequencing Data, Molecular
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein; Gene Proteins; Protein; Protein Gene Products; Proteins, Gene
D004247 | DNA | A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine). | DNA, Double-Stranded; Deoxyribonucleic Acid; ds-DNA; DNA, Double Stranded; Double-Stranded DNA; ds DNA
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D001483 | Base Sequence | The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. | DNA Sequence; Nucleotide Sequence; RNA Sequence; DNA Sequences; Base Sequences; Nucleotide Sequences; RNA Sequences; Sequence, Base; Sequence, DNA; Sequence, Nucleotide; Sequence, RNA; Sequences, Base; Sequences, DNA; Sequences, Nucleotide; Sequences, RNA
D016415 | Sequence Alignment | The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. | Sequence Homology Determination; Determination, Sequence Homology; Alignment, Sequence; Alignments, Sequence; Determinations, Sequence Homology; Sequence Alignments; Sequence Homology Determinations
D017422 | Sequence Analysis, DNA | A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. | DNA Sequence Analysis; Sequence Determination, DNA; Analysis, DNA Sequence; DNA Sequence Determination; DNA Sequence Determinations; DNA Sequencing; Determination, DNA Sequence; Determinations, DNA Sequence; Sequence Determinations, DNA; Analyses, DNA Sequence; DNA Sequence Analyses; Sequence Analyses, DNA; Sequencing, DNA
D044962 | Data Compression | Information application based on a variety of coding methods to minimize the amount of data to be stored, retrieved, or transmitted. Data compression can be applied to various forms of data, such as images and signals. It is used to reduce costs and increase efficiency in the maintenance of large volumes of data. | Image Compression; Compression, Data; Compression, Image
D020539 | Sequence Analysis, Protein | A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence. | Amino Acid Sequence Analysis; Peptide Sequence Analysis; Protein Sequence Analysis; Sequence Determination, Protein; Amino Acid Sequence Analyses; Amino Acid Sequence Determination; Amino Acid Sequence Determinations; Amino Acid Sequencing; Peptide Sequence Determination; Protein Sequencing; Sequence Analyses, Amino Acid; Sequence Analysis, Amino Acid; Sequence Analysis, Peptide; Sequence Determination, Amino Acid; Sequence Determinations, Amino Acid; Acid Sequencing, Amino; Analyses, Peptide Sequence; Analyses, Protein Sequence; Analysis, Peptide Sequence; Analysis, Protein Sequence; Peptide Sequence Analyses; Peptide Sequence Determinations; Protein Sequence Analyses; Protein Sequence Determination; Protein Sequence Determinations; Sequence Analyses, Peptide; Sequence Analyses, Protein; Sequence Determination, Peptide; Sequence Determinations, Peptide; Sequence Determinations, Protein; Sequencing, Amino Acid; Sequencing, Protein
