WMSA: a novel method for multiple sequence alignment of DNA sequences. 2022

Yanming Wei, Quan Zou, Furong Tang, and Liang Yu
School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China.

Multiple sequence alignment (MSA) is a fundamental problem in bioinformatics, and the quality of an alignment affects downstream analysis. MAFFT uses the Fast Fourier Transform to search for homologous segments, uses them as anchors to divide the sequences, and then aligns only the resulting segments, which saves time and memory without overly reducing alignment quality. However, MAFFT becomes slow when the dataset is large. We developed WMSA, a software tool that uses a divide-and-conquer method: it splits the sequences into clusters, aligns each cluster into a profile with the center star strategy, and then performs a progressive profile-profile alignment. Alignment is carried out by the compiled algorithms of MAFFT and K-Band, with multithreaded parallelism. Our method balances time, space and quality, and it performs better than MAFFT in test experiments on highly conserved datasets. The source code, implemented in C/C++ and supported on Linux, is freely available at https://github.com/malabz/WMSA/, and the datasets are available at https://github.com/malabz/WMSA-dataset. Supplementary data are available at Bioinformatics online.
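
To make two of the ideas named in the abstract more concrete, the following is a minimal Python sketch of a K-band style computation and of center selection for a center star alignment. It is an illustration under stated assumptions, not WMSA's code: it uses plain unit-cost edit distance rather than MAFFT's scoring, it returns distances rather than alignments, and the function names (`banded_edit_distance`, `edit_distance`, `pick_center`) are hypothetical. The band is doubled until the result is guaranteed to be exact, and the center is taken to be the sequence with the smallest total distance to all others.

```python
from typing import List, Optional


def banded_edit_distance(a: str, b: str, k: int) -> Optional[int]:
    """Unit-cost edit distance using only DP cells with |i - j| <= k.

    Returns the exact distance if it is <= k, otherwise None.
    (Illustrative helper; not part of WMSA or MAFFT.)
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None  # the end cell (n, m) lies outside the band
    inf = float("inf")
    # Row 0: turning the empty prefix of a into b[:j] costs j insertions.
    prev = {j: j for j in range(0, min(m, k) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            best = inf
            if j == 0:
                best = i  # delete the first i characters of a
            else:
                if (j - 1) in prev:  # match or substitution
                    best = prev[j - 1] + (a[i - 1] != b[j - 1])
                if (j - 1) in cur:  # insertion (gap in a)
                    best = min(best, cur[j - 1] + 1)
            if j in prev:  # deletion (gap in b)
                best = min(best, prev[j] + 1)
            cur[j] = best
        prev = cur
    d = prev.get(m, inf)
    return d if d <= k else None


def edit_distance(a: str, b: str) -> int:
    """Exact edit distance, found by doubling the band width until it fits."""
    k = max(1, abs(len(a) - len(b)))
    while True:
        d = banded_edit_distance(a, b, k)
        if d is not None:
            return d
        k *= 2


def pick_center(seqs: List[str]) -> int:
    """Index of the sequence with the smallest summed distance to all others."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


if __name__ == "__main__":
    cluster = ["ACGT", "ACGA", "ACGT", "TCGT"]
    print("center index:", pick_center(cluster))
```

For highly similar sequences the first, narrow band usually suffices, so only a thin diagonal strip of the dynamic programming matrix is filled. This is the intuition for why a banded approach suits highly conserved datasets; the sketch makes no claim about WMSA's actual performance.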

UI | MeSH Term | Description | Entries
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D001483 | Base Sequence | The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. | DNA Sequence,Nucleotide Sequence,RNA Sequence,DNA Sequences,Base Sequences,Nucleotide Sequences,RNA Sequences,Sequence, Base,Sequence, DNA,Sequence, Nucleotide,Sequence, RNA,Sequences, Base,Sequences, DNA,Sequences, Nucleotide,Sequences, RNA
D012984 | Software | Sequential operating programs and data which instruct the functioning of a digital computer. | Computer Programs,Computer Software,Open Source Software,Software Engineering,Software Tools,Computer Applications Software,Computer Programs and Programming,Computer Software Applications,Application, Computer Software,Applications Software, Computer,Applications Softwares, Computer,Applications, Computer Software,Computer Applications Softwares,Computer Program,Computer Software Application,Engineering, Software,Open Source Softwares,Program, Computer,Programs, Computer,Software Application, Computer,Software Applications, Computer,Software Tool,Software, Computer,Software, Computer Applications,Software, Open Source,Softwares, Computer Applications,Softwares, Open Source,Source Software, Open,Source Softwares, Open,Tool, Software,Tools, Software
D016415 | Sequence Alignment | The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. | Sequence Homology Determination,Determination, Sequence Homology,Alignment, Sequence,Alignments, Sequence,Determinations, Sequence Homology,Sequence Alignments,Sequence Homology Determinations
D017422 | Sequence Analysis, DNA | A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. | DNA Sequence Analysis,Sequence Determination, DNA,Analysis, DNA Sequence,DNA Sequence Determination,DNA Sequence Determinations,DNA Sequencing,Determination, DNA Sequence,Determinations, DNA Sequence,Sequence Determinations, DNA,Analyses, DNA Sequence,DNA Sequence Analyses,Sequence Analyses, DNA,Sequencing, DNA
