Estimates of introgression as a function of pairwise distances. 2019

Bastian Pfeifer and Durrell D Kapan
Institute for Medical Informatics, Statistics and Documentation, Medical University, Graz, Austria.

BACKGROUND: Research over the last 10 years highlights the increasing importance of hybridization between species as a major force structuring the evolution of genomes and potentially providing raw material for adaptation by natural and/or sexual selection. Fueled by work in a few model systems where phenotypic hybrids are easily identified, research into hybridization and introgression (the flow of genes between species) has exploded with the advent of whole-genome sequencing and emerging methods to detect the signature of hybridization at the whole-genome or chromosome level. Among these is a general class of methods that use patterns of single-nucleotide polymorphisms (SNPs) across a tree as markers of hybridization. These methods have been applied to detect introgression in a variety of genomic systems, ranging from butterflies to Neanderthals. However, when employed at a fine genomic scale, they perform poorly at quantifying introgression in small sample windows.

RESULTS: We introduce a novel method to detect introgression by combining two widely used statistics: pairwise nucleotide diversity dxy and Patterson's D. The resulting statistic, the distance fraction (df), accounts for genetic distance across possible topologies and is designed to simultaneously detect and quantify introgression. We also relate our new method to the recently published fd and incorporate these statistics into the genomics R package PopGenome, freely available on GitHub (pievos101/PopGenome) and the Comprehensive R Archive Network (CRAN). The supplemental material contains a wide range of simulation studies and a detailed manual on how to compute the statistics within the PopGenome framework.

CONCLUSIONS: We present a new distance-based statistic, df, that avoids the pitfalls of Patterson's D when applied to small genomic regions and accurately quantifies the fraction of introgression (f) for a wide range of simulation scenarios.
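To make the two ingredient statistics named in the abstract concrete, the following is a minimal Python sketch of their textbook frequency-based forms: Patterson's D computed from ABBA/BABA site-pattern weights, and dxy as the average number of pairwise differences per site between two populations. It is not the df statistic itself, and it is not the PopGenome implementation. The function names `patterson_d` and `dxy`, and the assumption of biallelic sites summarized as derived-allele frequencies for populations P1, P2, P3 and an outgroup O, are illustrative choices not taken from the abstract.

```python
from typing import Sequence


def patterson_d(p1: Sequence[float], p2: Sequence[float],
                p3: Sequence[float], o: Sequence[float]) -> float:
    """Patterson's D from per-site derived-allele frequencies.

    Assumes the tree (((P1, P2), P3), O) and biallelic sites.
    Returns NaN when no site carries ABBA or BABA signal.
    """
    abba = 0.0
    baba = 0.0
    for f1, f2, f3, fo in zip(p1, p2, p3, o):
        abba += (1 - f1) * f2 * f3 * (1 - fo)   # derived allele shared by P2 and P3
        baba += f1 * (1 - f2) * f3 * (1 - fo)   # derived allele shared by P1 and P3
    total = abba + baba
    if total == 0:
        return float("nan")
    return (abba - baba) / total


def dxy(px: Sequence[float], py: Sequence[float]) -> float:
    """Average pairwise differences per site between populations X and Y.

    For a biallelic site, the probability that one sequence drawn from X and
    one drawn from Y differ is px*(1-py) + (1-px)*py.
    """
    if len(px) == 0:
        return float("nan")
    diffs = sum(x * (1 - y) + (1 - x) * y for x, y in zip(px, py))
    return diffs / len(px)


# Hypothetical toy window of three sites (frequencies invented for illustration).
P1 = [0.0, 0.0, 1.0]
P2 = [1.0, 1.0, 0.0]
P3 = [1.0, 1.0, 1.0]
O = [0.0, 0.0, 0.0]

d_value = patterson_d(P1, P2, P3, O)
d_p2_p3 = dxy(P2, P3)
d_p1_p3 = dxy(P1, P3)
```

In this toy window P2 shares more derived alleles with P3 than P1 does, so D is positive and dxy between P2 and P3 is smaller than between P1 and P3. With only three sites, a single site changes D substantially, which illustrates why ratio statistics of this kind become unstable in small windows.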

| UI | MeSH Term | Description | Entries |
|---|---|---|---|
| D008957 | Models, Genetic | Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment. | Genetic Models,Genetic Model,Model, Genetic |
| D006824 | Hybridization, Genetic | The genetic process of crossbreeding between genetically dissimilar parents to produce a hybrid. | Crossbreeding,Hybridization, Intraspecies,Crossbreedings,Genetic Hybridization,Genetic Hybridizations,Hybridizations, Genetic,Hybridizations, Intraspecies,Intraspecies Hybridization,Intraspecies Hybridizations |
| D000073336 | Whole Genome Sequencing | Techniques to determine the entire sequence of the GENOME of an organism or individual. | Complete Genome Sequencing,Genome Sequencing, Complete,Genome Sequencing, Whole,Sequencing, Complete Genome,Sequencing, Whole Genome |
| D015233 | Models, Statistical | Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc. | Probabilistic Models,Statistical Models,Two-Parameter Models,Model, Statistical,Models, Binomial,Models, Polynomial,Statistical Model,Binomial Model,Binomial Models,Model, Binomial,Model, Polynomial,Model, Probabilistic,Model, Two-Parameter,Models, Probabilistic,Models, Two-Parameter,Polynomial Model,Polynomial Models,Probabilistic Model,Two Parameter Models,Two-Parameter Model |
| D051456 | Gene Flow | The change in gene frequency in a population due to migration of gametes or individuals (ANIMAL MIGRATION) across population barriers. In contrast, in GENETIC DRIFT the cause of gene frequency changes are not a result of population or gamete movement. | Flow, Gene |
| D020641 | Polymorphism, Single Nucleotide | A single nucleotide variation in a genetic sequence that occurs at appreciable frequency in the population. | SNPs,Single Nucleotide Polymorphism,Nucleotide Polymorphism, Single,Nucleotide Polymorphisms, Single,Polymorphisms, Single Nucleotide,Single Nucleotide Polymorphisms |
| D023281 | Genomics | The systematic study of the complete DNA sequences (GENOME) of organisms. Included is construction of complete genetic, physical, and transcript maps, and the analysis of this structural genomic information on a global scale such as in GENOME WIDE ASSOCIATION STUDIES. | Functional Genomics,Structural Genomics,Comparative Genomics,Genomics, Comparative,Genomics, Functional,Genomics, Structural |
| D030541 | Databases, Genetic | Databases devoted to knowledge about specific genes and gene products. | Genetic Databases,Genetic Sequence Databases,OMIM,Online Mendelian Inheritance In Man,Genetic Data Banks,Genetic Data Bases,Genetic Databanks,Genetic Information Databases,Bank, Genetic Data,Banks, Genetic Data,Data Bank, Genetic,Data Banks, Genetic,Data Base, Genetic,Data Bases, Genetic,Databank, Genetic,Databanks, Genetic,Database, Genetic,Database, Genetic Information,Database, Genetic Sequence,Databases, Genetic Information,Databases, Genetic Sequence,Genetic Data Bank,Genetic Data Base,Genetic Databank,Genetic Database,Genetic Information Database,Genetic Sequence Database,Information Database, Genetic,Information Databases, Genetic,Sequence Database, Genetic,Sequence Databases, Genetic |
