Bit-parallel sequence-to-graph alignment. 2019

Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
Center for Bioinformatics, Saarland University, Saarland Informatics Campus E2.1, 66123 Saarbrücken, Germany.

Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses and its applications include genome assembly, read error correction and variant calling with respect to a variation graph. We generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers' bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of up to w over naive algorithms. For a graph with |V| nodes and |E| edges and a sequence of length m, our bitvector-based graph alignment algorithm reaches a worst case runtime of O(|V|+⌈mw⌉|E| log w) for acyclic graphs and O(|V|+m|E| log w) for arbitrary cyclic graphs. We apply it to five different types of graphs and observe a speedup between 3-fold and 20-fold compared with a previous (asymptotically optimal) alignment algorithm. https://github.com/maickrau/GraphAligner. Supplementary data are available at Bioinformatics online.

UI MeSH Term Description Entries
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D016415 Sequence Alignment The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. Sequence Homology Determination,Determination, Sequence Homology,Alignment, Sequence,Alignments, Sequence,Determinations, Sequence Homology,Sequence Alignments,Sequence Homology Determinations
D016678 Genome The genetic complement of an organism, including all of its GENES, as represented in its DNA, or in some cases, its RNA. Genomes
D017422 Sequence Analysis, DNA A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. DNA Sequence Analysis,Sequence Determination, DNA,Analysis, DNA Sequence,DNA Sequence Determination,DNA Sequence Determinations,DNA Sequencing,Determination, DNA Sequence,Determinations, DNA Sequence,Sequence Determinations, DNA,Analyses, DNA Sequence,DNA Sequence Analyses,Sequence Analyses, DNA,Sequencing, DNA

Related Publications

Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
November 2014, Bioinformatics (Oxford, England),
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
August 2012, Journal of bioinformatics and computational biology,
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
July 2019, Bioinformatics (Oxford, England),
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
September 2020, Genome biology,
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
July 2021, Bioinformatics (Oxford, England),
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
January 1995, Proceedings. International Conference on Intelligent Systems for Molecular Biology,
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
June 2009, IEEE transactions on nanobioscience,
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
June 1993, Computer applications in the biosciences : CABIOS,
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
December 1996, Computer applications in the biosciences : CABIOS,
Mikko Rautiainen, and Veli Mäkinen, and Tobias Marschall
January 2014, BioMed research international,
Copied contents to your clipboard!