Computational analysis of bacterial RNA-Seq data. 2013

Ryan McClure, Divya Balasubramanian, Yan Sun, Maksym Bobrovskyy, Paul Sumby, Caroline A Genco, Carin K Vanderpool, Brian Tjaden
Department of Microbiology, Boston University School of Medicine, Boston, MA 02118, USA.

Recent advances in high-throughput RNA sequencing (RNA-seq) have enabled tremendous leaps forward in our understanding of bacterial transcriptomes. However, computational methods for analysis of bacterial transcriptome data have not kept pace with the large and growing data sets generated by RNA-seq technology. Here, we present new algorithms, specific to bacterial gene structures and transcriptomes, for analysis of RNA-seq data. The algorithms are implemented in an open source software system called Rockhopper that supports various stages of bacterial RNA-seq data analysis, including aligning sequencing reads to a genome, constructing transcriptome maps, quantifying transcript abundance, testing for differential gene expression, determining operon structures and visualizing results. We demonstrate the performance of Rockhopper using 2.1 billion sequenced reads from 75 RNA-seq experiments conducted with Escherichia coli, Neisseria gonorrhoeae, Salmonella enterica, Streptococcus pyogenes and Xenorhabdus nematophila. We find that the transcriptome maps generated by our algorithms are highly accurate when compared with focused experimental data from E. coli and N. gonorrhoeae, and we validate our system's ability to identify novel small RNAs, operons and transcription start sites. Our results suggest that Rockhopper can be used for efficient and accurate analysis of bacterial RNA-seq data, and that it can aid with elucidation of bacterial transcriptomes.
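
As an illustration of the "quantifying transcript abundance" stage mentioned in the abstract, the following is a minimal sketch of a generic length- and depth-normalized expression measure (reads per kilobase per million mapped reads, RPKM). It is not taken from the paper and should not be read as Rockhopper's own normalization; the function name, gene names, and read counts are all hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads.

    Generic normalization for illustration only (an assumption, not
    the method described in the paper).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical per-gene data: (reads mapped to the gene, gene length in bp)
counts = {
    "geneA": (500, 1000),
    "geneB": (500, 2000),
    "geneC": (0, 1500),
}
total_mapped = 1_000_000  # hypothetical total number of mapped reads

expression = {
    gene: rpkm(reads, length, total_mapped)
    for gene, (reads, length) in counts.items()
}

for gene, value in sorted(expression.items()):
    print(f"{gene}\t{value:.1f}")
```

In this toy data, geneA and geneB receive the same number of reads, but geneB is twice as long, so its normalized value is half that of geneA. Correcting for gene length and sequencing depth in this way is what allows abundances to be compared across genes and experiments.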

UI | MeSH Term | Description | Entries
D009876 | Operon | In bacteria, a group of metabolically related genes, with a common promoter, whose transcription into a single polycistronic MESSENGER RNA is under the control of an OPERATOR REGION. | Operons
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D012329 | RNA, Bacterial | Ribonucleic acid in bacteria having regulatory and catalytic roles as well as involvement in protein synthesis. | Bacterial RNA
D012984 | Software | Sequential operating programs and data which instruct the functioning of a digital computer. | Computer Programs; Computer Software; Open Source Software; Software Engineering; Software Tools; Computer Applications Software; Computer Programs and Programming; Computer Software Applications; Application, Computer Software; Applications Software, Computer; Applications Softwares, Computer; Applications, Computer Software; Computer Applications Softwares; Computer Program; Computer Software Application; Engineering, Software; Open Source Softwares; Program, Computer; Programs, Computer; Software Application, Computer; Software Applications, Computer; Software Tool; Software, Computer; Software, Computer Applications; Software, Open Source; Softwares, Computer Applications; Softwares, Open Source; Source Software, Open; Source Softwares, Open; Tool, Software; Tools, Software
D014158 | Transcription, Genetic | The biosynthesis of RNA carried out on a template of DNA. The biosynthesis of DNA from an RNA template is called REVERSE TRANSCRIPTION. | Genetic Transcription
D016415 | Sequence Alignment | The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. | Sequence Homology Determination; Determination, Sequence Homology; Alignment, Sequence; Alignments, Sequence; Determinations, Sequence Homology; Sequence Alignments; Sequence Homology Determinations
D016680 | Genome, Bacterial | The genetic complement of a BACTERIA as represented in its DNA. | Bacterial Genome; Bacterial Genomes; Genomes, Bacterial
D017423 | Sequence Analysis, RNA | A multistage process that includes cloning, physical mapping, subcloning, sequencing, and information analysis of an RNA SEQUENCE. | RNA Sequence Analysis; Sequence Determination, RNA; Analysis, RNA Sequence; Determination, RNA Sequence; Determinations, RNA Sequence; RNA Sequence Determination; RNA Sequence Determinations; RNA Sequencing; Sequence Determinations, RNA; Analyses, RNA Sequence; RNA Sequence Analyses; Sequence Analyses, RNA; Sequencing, RNA
D058727 | RNA, Small Untranslated | Short RNA, about 200 base pairs in length or shorter, that does not code for protein. | Short Noncoding RNA; Small Non-Coding RNA; Small Non-Messenger RNA; Small Non-Protein-Coding RNA; Small Noncoding RNA; Small Untranslated RNA; sncRNA; sncRNAs; Non-Coding RNA, Small; Non-Messenger RNA, Small; Non-Protein-Coding RNA, Small; Noncoding RNA, Short; Noncoding RNA, Small; RNA, Short Noncoding; RNA, Small Non-Coding; RNA, Small Non-Messenger; RNA, Small Non-Protein-Coding; RNA, Small Noncoding; Small Non Coding RNA; Small Non Messenger RNA; Small Non Protein Coding RNA; Untranslated RNA, Small
D059014 | High-Throughput Nucleotide Sequencing | Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced. The sequencing may be done by analysis of the synthesis or ligation products, hybridization to preexisting sequences, etc. | High-Throughput Sequencing; Illumina Sequencing; Ion Proton Sequencing; Ion Torrent Sequencing; Next-Generation Sequencing; Deep Sequencing; High-Throughput DNA Sequencing; High-Throughput RNA Sequencing; Massively-Parallel Sequencing; Pyrosequencing; DNA Sequencing, High-Throughput; High Throughput DNA Sequencing; High Throughput Nucleotide Sequencing; High Throughput RNA Sequencing; High Throughput Sequencing; Massively Parallel Sequencing; Next Generation Sequencing; Nucleotide Sequencing, High-Throughput; RNA Sequencing, High-Throughput; Sequencing, Deep; Sequencing, High-Throughput; Sequencing, High-Throughput DNA; Sequencing, High-Throughput Nucleotide; Sequencing, High-Throughput RNA; Sequencing, Illumina; Sequencing, Ion Proton; Sequencing, Ion Torrent; Sequencing, Massively-Parallel; Sequencing, Next-Generation
