Accelerating metagenomic read classification on CUDA-enabled GPUs. 2017

Robin Kobus, Christian Hundt, André Müller, and Bertil Schmidt
Institute of Computer Science, Johannes Gutenberg University Mainz, Staudingerweg 9, Mainz, 55435, Germany. rkobus@students.uni-mainz.de.

BACKGROUND: Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e., the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed.

RESULTS: We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (CLARK) method. Using the processing power of a single Titan X GPU, cuCLARK can reach classification speeds of up to 50 million reads per minute. Compared to multi-threaded CLARK executed on a 16-core Xeon CPU workstation, the corresponding speedups range from 3.2 to 6.6 for species-level classification and from 3.7 to 6.4 for genus-level classification.

CONCLUSIONS: cuCLARK can perform metagenomic read classification at superior speeds on CUDA-enabled GPUs. It is free software licensed under the GPL and can be downloaded free of charge at https://github.com/funatiq/cuclark.
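To make the idea of k-mer-based read classification concrete, the following is a minimal CPU-only sketch in Python. It is not the cuCLARK or CLARK implementation: the function names, the toy reference sequences, and the choice of k = 4 are all assumptions made for illustration. The sketch keeps only k-mers that occur in exactly one reference target and assigns each read to the target with the most matching k-mers. It leaves out details such as reverse-complement handling, confidence scoring, and any GPU parallelization.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every k-mer (substring of length k) of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_discriminative_index(references, k):
    """Map each k-mer that occurs in exactly one target to that target."""
    owners = defaultdict(set)
    for target, genome in references.items():
        for kmer in kmers(genome, k):
            owners[kmer].add(target)
    # k-mers shared by two or more targets cannot tell them apart, so drop them.
    return {kmer: next(iter(ts)) for kmer, ts in owners.items() if len(ts) == 1}


def classify_read(read, index, k):
    """Return the target with the most k-mer hits, or None if nothing matches."""
    hits = Counter(index[kmer] for kmer in kmers(read, k) if kmer in index)
    if not hits:
        return None
    return hits.most_common(1)[0][0]


# Toy reference sequences, invented for illustration only.
references = {
    "species_A": "ACGTACGTTAGC",
    "species_B": "TTGACCGTACGA",
}
index = build_discriminative_index(references, k=4)
print(classify_read("ACGTTAGC", index, k=4))  # prints species_A
```

Each read is classified independently of all others, which is the property that makes this kind of workload a natural fit for data-parallel hardware such as GPUs.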

UI | MeSH Term | Description | Entries
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D014584 | User-Computer Interface | The portion of an interactive computer program that issues messages to and receives commands from a user. | Interface, User Computer,Virtual Systems,User Computer Interface,Interface, User-Computer,Interfaces, User Computer,Interfaces, User-Computer,System, Virtual,Systems, Virtual,User Computer Interfaces,User-Computer Interfaces,Virtual System
D017422 | Sequence Analysis, DNA | A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. | DNA Sequence Analysis,Sequence Determination, DNA,Analysis, DNA Sequence,DNA Sequence Determination,DNA Sequence Determinations,DNA Sequencing,Determination, DNA Sequence,Determinations, DNA Sequence,Sequence Determinations, DNA,Analyses, DNA Sequence,DNA Sequence Analyses,Sequence Analyses, DNA,Sequencing, DNA
D056186 | Metagenomics | The systematic study of the GENOMES of assemblages of organisms. | Community Genomics,Environmental Genomics,Population Genomics,Genomics, Community,Genomics, Environmental,Genomics, Population
D059014 | High-Throughput Nucleotide Sequencing | Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced. The sequencing may be done by analysis of the synthesis or ligation products, hybridization to preexisting sequences, etc. | High-Throughput Sequencing,Illumina Sequencing,Ion Proton Sequencing,Ion Torrent Sequencing,Next-Generation Sequencing,Deep Sequencing,High-Throughput DNA Sequencing,High-Throughput RNA Sequencing,Massively-Parallel Sequencing,Pyrosequencing,DNA Sequencing, High-Throughput,High Throughput DNA Sequencing,High Throughput Nucleotide Sequencing,High Throughput RNA Sequencing,High Throughput Sequencing,Massively Parallel Sequencing,Next Generation Sequencing,Nucleotide Sequencing, High-Throughput,RNA Sequencing, High-Throughput,Sequencing, Deep,Sequencing, High-Throughput,Sequencing, High-Throughput DNA,Sequencing, High-Throughput Nucleotide,Sequencing, High-Throughput RNA,Sequencing, Illumina,Sequencing, Ion Proton,Sequencing, Ion Torrent,Sequencing, Massively-Parallel,Sequencing, Next-Generation
D020407 | Internet | A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks. The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange. | World Wide Web,Cyber Space,Cyberspace,Web, World Wide,Wide Web, World
