Linear regression in genetic association studies. 2013

Petra Bůžková
Department of Biostatistics, University of Washington, Seattle, WA, USA. buzkova@u.washington.edu

In genomic research, phenotype transformations are commonly used as a straightforward way to reach normality of the model outcome. Many researchers still believe this to be necessary for proper inference. Using regression simulations, we show that phenotype transformations are typically not needed and, when applied to a phenotype with heteroscedasticity, result in inflated Type I error rates. We further explain that it is important to address the combination of rare variant genotypes and heteroscedasticity. An incorrectly estimated parameter variability or an incorrect choice of the distribution of the underlying test statistic leads to spurious detection of associations. We conclude that it is the combination of heteroscedasticity, minor allele frequency, sample size, and, to a much lesser extent, the error distribution that matters for proper statistical inference.
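
The role of incorrectly estimated parameter variability can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the simulation design used in the paper: the function name `simulate_type1`, the sample size, the minor allele frequency, the form of the heteroscedasticity (error SD growing with the minor allele count), and the use of a sandwich (HC0) standard error as the comparison are all illustrative choices. Both tests use a normal critical value of 1.96, which is itself an approximation. The phenotype is generated with no genotype effect, so every rejection is a false positive.

```python
import math
import random


def simulate_type1(n=400, maf=0.05, extra_sd=2.0, reps=1000, seed=1):
    """Return (model_based_rate, robust_rate): the fraction of null
    replicates rejected at a nominal two-sided 5% level.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    crit = 1.96  # assumption: normal approximation to the critical value
    hits_model = hits_robust = done = 0
    for _ in range(reps):
        # Additive genotype coding: number of minor alleles (0, 1, or 2).
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
        # Null phenotype: mean does not depend on g, but the error SD does.
        y = [rng.gauss(0.0, 1.0 + extra_sd * gi) for gi in g]

        gbar = sum(g) / n
        ybar = sum(y) / n
        dx = [gi - gbar for gi in g]
        sxx = sum(d * d for d in dx)
        if sxx == 0.0:
            continue  # no genotype variation in this sample; skip it

        # Ordinary least squares slope and residuals.
        beta = sum(d * yi for d, yi in zip(dx, y)) / sxx
        resid = [yi - ybar - beta * d for yi, d in zip(y, dx)]

        # Model-based SE: assumes one common error variance.
        s2 = sum(r * r for r in resid) / (n - 2)
        se_model = math.sqrt(s2 / sxx)

        # Sandwich (HC0) SE: lets the error variance differ by observation.
        se_robust = math.sqrt(sum((d * r) ** 2 for d, r in zip(dx, resid))) / sxx

        hits_model += abs(beta / se_model) > crit
        hits_robust += abs(beta / se_robust) > crit
        done += 1
    return hits_model / done, hits_robust / done


if __name__ == "__main__":
    model_rate, robust_rate = simulate_type1()
    print(f"model-based SE: {model_rate:.3f}   sandwich SE: {robust_rate:.3f}")
```

In this setting the errors are normal within each genotype group, yet the model-based test should reject far more often than the nominal 5%, because the few minor-allele carriers have the largest error variance and the pooled variance estimate understates the variability of the slope. The sandwich-based rate should sit much closer to the nominal level, although with few carriers it may still be somewhat inflated.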

UI MeSH Term Description Entries
D008297 Male Males
D008957 Models, Genetic Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment. Genetic Models,Genetic Model,Model, Genetic
D010641 Phenotype The outward appearance of the individual. It is the product of interactions between genes, and between the GENOTYPE and the environment. Phenotypes
D005260 Female Females
D005838 Genotype The genetic constitution of the individual, comprising the ALLELES present at each GENETIC LOCUS. Genogroup,Genogroups,Genotypes
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D016014 Linear Models Statistical models in which the value of a parameter for a given value of a factor is assumed to be equal to a + bx, where a and b are constants. The models predict a linear regression. Linear Regression,Log-Linear Models,Models, Linear,Linear Model,Linear Regressions,Log Linear Models,Log-Linear Model,Model, Linear,Model, Log-Linear,Models, Log-Linear,Regression, Linear,Regressions, Linear
D056726 Genetic Association Studies The analysis of a sequence such as a region of a chromosome, a haplotype, a gene, or an allele for its involvement in controlling the phenotype of a specific trait, metabolic pathway, or disease. Candidate Gene Identification,Candidate Gene Analysis,Candidate Gene Association Studies,Candidate Gene Association Study,Gene Discovery,Genotype-Phenotype Association,Genotype-Phenotype Associations,Genotype-Phenotype Correlation,Genotype-Phenotype Correlations,Analyses, Candidate Gene,Analysis, Candidate Gene,Association Studies, Genetic,Association Study, Genetic,Association, Genotype-Phenotype,Associations, Genotype-Phenotype,Candidate Gene Analyses,Correlation, Genotype-Phenotype,Correlations, Genotype-Phenotype,Discovery, Gene,Gene Analyses, Candidate,Gene Analysis, Candidate,Gene Identification, Candidate,Genetic Association Study,Genotype Phenotype Association,Genotype Phenotype Associations,Genotype Phenotype Correlation,Genotype Phenotype Correlations,Identification, Candidate Gene,Studies, Genetic Association,Study, Genetic Association