iMGEins: detecting novel mobile genetic elements inserted in individual genomes. 2018

Junwoo Bae, Kyeong Won Lee, Mohammad Nazrul Islam, Hyung-Soon Yim, Heejin Park, and Mina Rho
Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea.

BACKGROUND: Recent advances in sequencing technology have made it possible to investigate personal genomes for structural variations, which have been studied extensively to identify their association with the physiology of diseases such as cancer. In particular, mobile genetic elements (MGEs) are a major constituent of the human genome and cause genome instability through insertion, mutation, and rearrangement.

RESULTS: We have developed a new program, iMGEins, that identifies novel MGEs from the sequencing reads of individual genomes and reports the breakpoints together with the supporting reads and the MGEs detected. iMGEins is the first MGE detection program that integrates three algorithmic components: discordant read-pair mapping, split-read mapping, and insertion sequence assembly. Our evaluation showed outstanding performance in detecting novel MGEs in simulated genomes as well as in real personal genomes. The average recall and precision of iMGEins are 96.67% and 100%, respectively, the highest among the programs compared. In tests on real human genome data from the NA12878 sample, iMGEins shows the highest accuracy in detecting MGEs within 20 bp of the annotated breakpoints.

CONCLUSIONS: iMGEins was developed to accurately detect breakpoints and report inserted MGEs, in order to study the dynamics of MGEs in individual genomes. Compared with other programs, iMGEins offers the valuable features of identifying novel MGEs and assembling the inserted MGE sequences.
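The recall and precision figures, and the 20 bp proximity criterion, can be illustrated with a small sketch. The abstract does not describe how predicted and annotated breakpoints were matched, so the one-to-one matching below is an assumption, and the function name `evaluate_calls` and its `(chrom, pos)` input layout are hypothetical rather than part of iMGEins.

```python
from collections import defaultdict


def evaluate_calls(predicted, annotated, tolerance=20):
    """Return (precision, recall) for breakpoint calls.

    predicted, annotated: iterables of (chromosome, position) tuples.
    A call counts as a true positive if it lies within `tolerance` bp of
    an annotated breakpoint on the same chromosome. Each annotated
    breakpoint is matched at most once (an assumed matching rule).
    """
    truth = defaultdict(list)
    for chrom, pos in annotated:
        truth[chrom].append(pos)
    calls = defaultdict(list)
    for chrom, pos in predicted:
        calls[chrom].append(pos)

    true_positives = 0
    for chrom, positions in calls.items():
        called = sorted(positions)
        known = sorted(truth.get(chrom, []))
        i = j = 0
        # Two-pointer sweep over both sorted position lists.
        while i < len(called) and j < len(known):
            if abs(called[i] - known[j]) <= tolerance:
                true_positives += 1
                i += 1
                j += 1
            elif called[i] < known[j]:
                i += 1  # call has no annotated partner nearby
            else:
                j += 1  # annotated breakpoint was missed

    n_pred = sum(len(v) for v in calls.values())
    n_true = sum(len(v) for v in truth.values())
    precision = true_positives / n_pred if n_pred else 0.0
    recall = true_positives / n_true if n_true else 0.0
    return precision, recall


# Toy data, not taken from the paper.
toy_predicted = [("chr1", 100), ("chr1", 5000), ("chr2", 300)]
toy_annotated = [("chr1", 110), ("chr1", 5030), ("chr2", 295), ("chr3", 10)]
toy_precision, toy_recall = evaluate_calls(toy_predicted, toy_annotated)
```

In the toy data, two of the three calls fall within 20 bp of an annotated breakpoint, and two of the four annotated breakpoints are recovered.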

UI MeSH Term Description Entries
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D012984 Software Sequential operating programs and data which instruct the functioning of a digital computer. Computer Programs,Computer Software,Open Source Software,Software Engineering,Software Tools,Computer Applications Software,Computer Programs and Programming,Computer Software Applications,Application, Computer Software,Applications Software, Computer,Applications Softwares, Computer,Applications, Computer Software,Computer Applications Softwares,Computer Program,Computer Software Application,Engineering, Software,Open Source Softwares,Program, Computer,Programs, Computer,Software Application, Computer,Software Applications, Computer,Software Tool,Software, Computer,Software, Computer Applications,Software, Open Source,Softwares, Computer Applications,Softwares, Open Source,Source Software, Open,Source Softwares, Open,Tool, Software,Tools, Software
D015894 Genome, Human The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs. Human Genome,Genomes, Human,Human Genomes
D017422 Sequence Analysis, DNA A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. DNA Sequence Analysis,Sequence Determination, DNA,Analysis, DNA Sequence,DNA Sequence Determination,DNA Sequence Determinations,DNA Sequencing,Determination, DNA Sequence,Determinations, DNA Sequence,Sequence Determinations, DNA,Analyses, DNA Sequence,DNA Sequence Analyses,Sequence Analyses, DNA,Sequencing, DNA
D059014 High-Throughput Nucleotide Sequencing Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced. The sequencing may be done by analysis of the synthesis or ligation products, hybridization to preexisting sequences, etc. High-Throughput Sequencing,Illumina Sequencing,Ion Proton Sequencing,Ion Torrent Sequencing,Next-Generation Sequencing,Deep Sequencing,High-Throughput DNA Sequencing,High-Throughput RNA Sequencing,Massively-Parallel Sequencing,Pyrosequencing,DNA Sequencing, High-Throughput,High Throughput DNA Sequencing,High Throughput Nucleotide Sequencing,High Throughput RNA Sequencing,High Throughput Sequencing,Massively Parallel Sequencing,Next Generation Sequencing,Nucleotide Sequencing, High-Throughput,RNA Sequencing, High-Throughput,Sequencing, Deep,Sequencing, High-Throughput,Sequencing, High-Throughput DNA,Sequencing, High-Throughput Nucleotide,Sequencing, High-Throughput RNA,Sequencing, Illumina,Sequencing, Ion Proton,Sequencing, Ion Torrent,Sequencing, Massively-Parallel,Sequencing, Next-Generation
D020071 Interspersed Repetitive Sequences Copies of transposable elements interspersed throughout the genome, some of which are still active and often referred to as "jumping genes". There are two classes of interspersed repetitive elements. Class I elements (or RETROELEMENTS - such as retrotransposons, retroviruses, LONG INTERSPERSED NUCLEOTIDE ELEMENTS and SHORT INTERSPERSED NUCLEOTIDE ELEMENTS) transpose via reverse transcription of an RNA intermediate. Class II elements (or DNA TRANSPOSABLE ELEMENTS - such as transposons, Tn elements, insertion sequence elements and mobile gene cassettes of bacterial integrons) transpose directly from one site in the DNA to another. Dispersed Repetitive Sequences,Genes, Jumping,Interspersed Repetitive Elements,Jumping Genes,Mobile Genetic Elements,Repetitive Sequences, Dispersed,Repetitive Sequences, Interspersed,Elements, Mobile Genetic,Genetic Elements, Mobile,Dispersed Repetitive Sequence,Element, Interspersed Repetitive,Element, Mobile Genetic,Elements, Interspersed Repetitive,Gene, Jumping,Genetic Element, Mobile,Interspersed Repetitive Element,Interspersed Repetitive Sequence,Jumping Gene,Mobile Genetic Element,Repetitive Element, Interspersed,Repetitive Elements, Interspersed,Repetitive Sequence, Dispersed,Repetitive Sequence, Interspersed,Sequence, Dispersed Repetitive,Sequence, Interspersed Repetitive,Sequences, Dispersed Repetitive,Sequences, Interspersed Repetitive
