Local alignment of two-base encoded DNA sequence. 2009

Nils Homer, Barry Merriman, and Stanley F Nelson
Department of Computer Science, University of California Los Angeles, Los Angeles, California 90095, USA. nhomer@cs.ucla.edu

BACKGROUND: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. To compare such data to a reference sequence, the data must be decoded into base sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity.

RESULTS: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions.

CONCLUSIONS: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, facilitating genome re-sequencing efforts based on this form of sequence data.
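The abstract does not spell out the encoding, so the following is only an illustrative sketch. It assumes the commonly used scheme in which each base gets a 2-bit code (A=0, C=1, G=2, T=3) and each measured value is the XOR of two adjacent base codes; the function names `encode` and `decode` are hypothetical. Under that assumption, it shows why decoding is deterministic, why a single measurement error corrupts every downstream base when decoded naively, and why a true single-base variant instead changes two adjacent encoded values.

```python
from typing import List, Tuple

# Assumed 2-bit base codes; an encoded value is the XOR of adjacent base codes.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode(seq: str) -> Tuple[str, List[int]]:
    """Return the first base plus one encoded value per adjacent base pair."""
    codes = [BASE_TO_BITS[b] for b in seq]
    return seq[0], [a ^ b for a, b in zip(codes, codes[1:])]


def decode(first_base: str, values: List[int]) -> str:
    """Rebuild the base sequence from a known first base and encoded values."""
    code = BASE_TO_BITS[first_base]
    bases = [first_base]
    for v in values:
        code ^= v  # each value tells us how to step from one base to the next
        bases.append(BITS_TO_BASE[code])
    return "".join(bases)


reference = "ACGTTGCA"
first, values = encode(reference)

# Without errors, decoding recovers the original sequence exactly.
roundtrip = decode(first, values)

# A single measurement error (third encoded value altered) changes
# every base after that point when decoded naively.
noisy = list(values)
noisy[2] ^= 1
decoded_noisy = decode(first, noisy)
mismatches = sum(a != b for a, b in zip(reference, decoded_noisy))

# A true single-base variant (T -> A at index 3) alters exactly two
# adjacent encoded values, which is a different signature from an error.
_, variant_values = encode("ACGATGCA")
changed = [i for i, (a, b) in enumerate(zip(values, variant_values)) if a != b]

print(values, roundtrip, decoded_noisy, mismatches, changed)
```

Because an isolated altered value and a pair of adjacent altered values have different interpretations, an aligner that decodes and aligns at the same time, as the abstract describes, can weigh "measurement error" against "sequence variant" rather than committing to a naive decoding first.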

UI | MeSH Term | Description | Entries
D004247 | DNA | A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine). | DNA, Double-Stranded,Deoxyribonucleic Acid,ds-DNA,DNA, Double Stranded,Double-Stranded DNA,ds DNA
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D001483 | Base Sequence | The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. | DNA Sequence,Nucleotide Sequence,RNA Sequence,DNA Sequences,Base Sequences,Nucleotide Sequences,RNA Sequences,Sequence, Base,Sequence, DNA,Sequence, Nucleotide,Sequence, RNA,Sequences, Base,Sequences, DNA,Sequences, Nucleotide,Sequences, RNA
D016415 | Sequence Alignment | The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. | Sequence Homology Determination,Determination, Sequence Homology,Alignment, Sequence,Alignments, Sequence,Determinations, Sequence Homology,Sequence Alignments,Sequence Homology Determinations
D017422 | Sequence Analysis, DNA | A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. | DNA Sequence Analysis,Sequence Determination, DNA,Analysis, DNA Sequence,DNA Sequence Determination,DNA Sequence Determinations,DNA Sequencing,Determination, DNA Sequence,Determinations, DNA Sequence,Sequence Determinations, DNA,Analyses, DNA Sequence,DNA Sequence Analyses,Sequence Analyses, DNA,Sequencing, DNA
