Genome-wide sequencing as a first-tier screening test for short tandem repeat expansions. 2021

Indhu-Shree Rajan-Babu, and Junran J Peng, and Readman Chiu, and , and , and Chenkai Li, and Arezoo Mohajeri, and Egor Dolzhenko, and Michael A Eberle, and Inanc Birol, and Jan M Friedman
Department of Medical Genetics, University of British Columbia and Children's & Women's Hospital, Vancouver, BC, V6H3N1, Canada. indhu.babu@bcchr.ca.

Screening for short tandem repeat (STR) expansions in next-generation sequencing data can enable diagnosis, optimal clinical management and treatment, and accurate genetic counseling of patients with repeat expansion disorders. We aimed to develop an efficient computational workflow for reliable detection of STR expansions in next-generation sequencing data and to demonstrate its clinical utility.

We characterized the performance of eight STR analysis methods (lobSTR, HipSTR, RepeatSeq, ExpansionHunter, TREDPARSE, GangSTR, STRetch, and exSTRa) on next-generation sequencing datasets from samples with known disease-causing full-mutation STR expansions and on genomes simulated to harbor repeat expansions at selected loci, and we optimized their sensitivity. We then used a machine learning decision tree classifier to identify an optimal combination of methods for full-mutation detection.

In Burrows-Wheeler Aligner (BWA)-aligned genomes, the ensemble approach of using ExpansionHunter, STRetch, and exSTRa performed best (precision = 82%, recall = 100%, F1-score = 90%). We applied this pipeline to screen 301 families of children with suspected genetic disorders. We identified 10 individuals with full mutations at the AR, ATXN1, ATXN8, DMPK, FXN, or HTT disease STR locus in the analyzed families. Additional candidates identified in our analysis include two probands with borderline ATXN2 expansions, falling between the established repeat size ranges for reduced-penetrance and full-penetrance full mutations, and seven individuals with FMR1 CGG repeats in the intermediate/premutation repeat size range. In 67 probands with a prior negative clinical PCR test for the FMR1, FXN, or DMPK disease STR locus, or the spinocerebellar ataxia disease STR panel, our pipeline did not falsely identify any aberrant expansion. We performed clinical PCR tests on seven of the 10 full-mutation samples identified by our pipeline and confirmed the expansion status in all seven, showing complete concordance between our bioinformatics and molecular findings.

We have successfully demonstrated the application of a well-optimized bioinformatics pipeline that promotes the utility of genome-wide sequencing as a first-tier screening test to detect expansions of known disease STRs. Interrogating clinical next-generation sequencing data for pathogenic STR expansions using our ensemble pipeline can improve diagnostic yield and enhance clinical outcomes for patients with repeat expansion disorders.
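As a worked check on the reported ensemble metrics, the F1-score is the harmonic mean of precision and recall, so the stated precision of 82% and recall of 100% should give an F1-score of about 90%. The sketch below computes this and also illustrates one possible way of combining calls from the three tools. The abstract does not state how the individual calls are combined, so the "any caller flags the locus" rule, the function names, and the example calls are all assumptions made for illustration only.

```python
from typing import Mapping


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def ensemble_flags_expansion(calls: Mapping[str, bool]) -> bool:
    """Flag a locus if at least one caller reports a full-mutation expansion.

    ASSUMPTION: this union rule is hypothetical; the abstract does not
    describe how the three tools' calls are actually combined.
    """
    return any(calls.values())


# Precision and recall as reported in the abstract for BWA-aligned genomes.
reported_f1 = f1_score(precision=0.82, recall=1.00)
print(f"F1 = {reported_f1:.3f}")  # F1 = 0.901

# Hypothetical per-tool calls for a single locus (not data from the study).
example_calls = {"ExpansionHunter": False, "STRetch": True, "exSTRa": False}
print(ensemble_flags_expansion(example_calls))  # True
```

A union rule of this kind favors recall over precision, which would be consistent with a screening setting where flagged expansions are subsequently confirmed by a clinical PCR test.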

UI | MeSH Term | Description | Entries
D009154 | Mutation | Any detectable and heritable change in the genetic material that causes a change in the GENOTYPE and which is transmitted to daughter cells and to succeeding generations. | Mutations
D003663 | Decision Trees | A graphic device used in decision analysis, in which a series of decision options is represented as branches (hierarchical). | Decision Tree; Tree, Decision; Trees, Decision
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000066491 | Clinical Decision-Making | Process of formulating a diagnosis based on medical history and physical or mental examinations, and/or choosing an appropriate intervention. | Medical Decision-Making; Clinical Decision Making; Decision-Making, Clinical; Decision-Making, Medical; Medical Decision Making
D000069550 | Machine Learning | A type of ARTIFICIAL INTELLIGENCE that enables COMPUTERS to independently initiate and execute LEARNING when exposed to new data. | Transfer Learning; Learning, Machine; Learning, Transfer
D000073336 | Whole Genome Sequencing | Techniques to determine the entire sequence of the GENOME of an organism or individual. | Complete Genome Sequencing; Genome Sequencing, Complete; Genome Sequencing, Whole; Sequencing, Complete Genome; Sequencing, Whole Genome
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D000483 | Alleles | Variant forms of the same gene, occupying the same locus on homologous CHROMOSOMES, and governing the variants in production of the same gene product. | Allelomorphs; Allele; Allelomorph
D015203 | Reproducibility of Results | The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results. | Reliability and Validity; Reliability of Result; Reproducibility Of Result; Reproducibility of Finding; Validity of Result; Validity of Results; Face Validity; Reliability (Epidemiology); Reliability of Results; Reproducibility of Findings; Test-Retest Reliability; Validity (Epidemiology); Finding Reproducibilities; Finding Reproducibility; Of Result, Reproducibility; Of Results, Reproducibility; Reliabilities, Test-Retest; Reliability, Test-Retest; Result Reliabilities; Result Reliability; Result Validities; Result Validity; Result, Reproducibility Of; Results, Reproducibility Of; Test Retest Reliability; Validity and Reliability; Validity, Face
D042622 | DNA Repeat Expansion | An increased number of repeats of a genomic, tandemly repeated DNA sequence from one generation to the next. | Expanded DNA Repeats; DNA Repeat Expansions; DNA Repeat, Expanded; DNA Repeats, Expanded; Expanded DNA Repeat; Expansion, DNA Repeat; Expansions, DNA Repeat; Repeat Expansion, DNA; Repeat Expansions, DNA; Repeat, Expanded DNA; Repeats, Expanded DNA
