General power and sample size calculations for high-dimensional genomic data. 2013

Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands.

In the design of microarray or next-generation sequencing experiments it is crucial to choose the appropriate number of biological replicates. As often the number of differentially expressed genes and their effect sizes are small and too few replicates will lead to insufficient power to detect these. On the other hand, too many replicates unnecessary leads to high experimental costs. Power and sample size analysis can guide experimentalist in choosing the appropriate number of biological replicates. Several methods for power and sample size analysis have recently been proposed for microarray data. However, most of these are restricted to two group comparisons and require user-defined effect sizes. Here we propose a pilot-data based method for power and sample size analysis which can handle more general experimental designs and uses pilot-data to obtain estimates of the effect sizes. The method can also handle χ2 distributed test statistics which enables power and sample size calculations for a much wider class of models, including high-dimensional generalized linear models which are used, e.g., for RNA-seq data analysis. The performance of the method is evaluated using simulated and experimental data from several microarray and next-generation sequencing experiments. Furthermore, we compare our proposed method for estimation of the density of effect sizes from pilot data with a recent proposed method specific for two group comparisons.

UI MeSH Term Description Entries
D008957 Models, Genetic Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment. Genetic Models,Genetic Model,Model, Genetic
D003627 Data Interpretation, Statistical Application of statistical procedures to analyze specific observed or assumed facts from a particular study. Data Analysis, Statistical,Data Interpretations, Statistical,Interpretation, Statistical Data,Statistical Data Analysis,Statistical Data Interpretation,Analyses, Statistical Data,Analysis, Statistical Data,Data Analyses, Statistical,Interpretations, Statistical Data,Statistical Data Analyses,Statistical Data Interpretations
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D000818 Animals Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. Animal,Metazoa,Animalia
D016014 Linear Models Statistical models in which the value of a parameter for a given value of a factor is assumed to be equal to a + bx, where a and b are constants. The models predict a linear regression. Linear Regression,Log-Linear Models,Models, Linear,Linear Model,Linear Regressions,Log Linear Models,Log-Linear Model,Model, Linear,Model, Log-Linear,Models, Log-Linear,Regression, Linear,Regressions, Linear
D017423 Sequence Analysis, RNA A multistage process that includes cloning, physical mapping, subcloning, sequencing, and information analysis of an RNA SEQUENCE. RNA Sequence Analysis,Sequence Determination, RNA,Analysis, RNA Sequence,Determination, RNA Sequence,Determinations, RNA Sequence,RNA Sequence Determination,RNA Sequence Determinations,RNA Sequencing,Sequence Determinations, RNA,Analyses, RNA Sequence,RNA Sequence Analyses,Sequence Analyses, RNA,Sequencing, RNA
D059014 High-Throughput Nucleotide Sequencing Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced. The sequencing may be done by analysis of the synthesis or ligation products, hybridization to preexisting sequences, etc. High-Throughput Sequencing,Illumina Sequencing,Ion Proton Sequencing,Ion Torrent Sequencing,Next-Generation Sequencing,Deep Sequencing,High-Throughput DNA Sequencing,High-Throughput RNA Sequencing,Massively-Parallel Sequencing,Pyrosequencing,DNA Sequencing, High-Throughput,High Throughput DNA Sequencing,High Throughput Nucleotide Sequencing,High Throughput RNA Sequencing,High Throughput Sequencing,Massively Parallel Sequencing,Next Generation Sequencing,Nucleotide Sequencing, High-Throughput,RNA Sequencing, High-Throughput,Sequencing, Deep,Sequencing, High-Throughput,Sequencing, High-Throughput DNA,Sequencing, High-Throughput Nucleotide,Sequencing, High-Throughput RNA,Sequencing, Illumina,Sequencing, Ion Proton,Sequencing, Ion Torrent,Sequencing, Massively-Parallel,Sequencing, Next-Generation
D018401 Sample Size The number of units (persons, animals, patients, specified circumstances, etc.) in a population to be studied. The sample size should be big enough to have a high likelihood of detecting a true difference between two groups. (From Wassertheil-Smoller, Biostatistics and Epidemiology, 1990, p95) Sample Sizes,Size, Sample,Sizes, Sample
D020411 Oligonucleotide Array Sequence Analysis Hybridization of a nucleic acid sample to a very large set of OLIGONUCLEOTIDE PROBES, which have been attached individually in columns and rows to a solid support, to determine a BASE SEQUENCE, or to detect variations in a gene sequence, GENE EXPRESSION, or for GENE MAPPING. DNA Microarrays,Gene Expression Microarray Analysis,Oligonucleotide Arrays,cDNA Microarrays,DNA Arrays,DNA Chips,DNA Microchips,Gene Chips,Oligodeoxyribonucleotide Array Sequence Analysis,Oligonucleotide Microarrays,Sequence Analysis, Oligonucleotide Array,cDNA Arrays,Array, DNA,Array, Oligonucleotide,Array, cDNA,Arrays, DNA,Arrays, Oligonucleotide,Arrays, cDNA,Chip, DNA,Chip, Gene,Chips, DNA,Chips, Gene,DNA Array,DNA Chip,DNA Microarray,DNA Microchip,Gene Chip,Microarray, DNA,Microarray, Oligonucleotide,Microarray, cDNA,Microarrays, DNA,Microarrays, Oligonucleotide,Microarrays, cDNA,Microchip, DNA,Microchips, DNA,Oligonucleotide Array,Oligonucleotide Microarray,cDNA Array,cDNA Microarray

Related Publications

Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
June 2001, Controlled clinical trials,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
November 2010, Statistics in medicine,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
November 2018, Briefings in bioinformatics,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
August 2012, Epidemiologic methods,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
June 1993, Biometrics,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
July 2003, Multivariate behavioral research,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
December 1993, Statistics in medicine,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
July 2001, Statistics in medicine,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
April 2008, Biometrical journal. Biometrische Zeitschrift,
Maarten van Iterson, and Mark A van de Wiel, and Judith M Boer, and Renée X de Menezes
November 1989, Psychological bulletin,
Copied contents to your clipboard!