Removing batch effects in analysis of expression microarray data: an evaluation of six batch adjustment methods. 2011

Chao Chen, Kay Grennan, Judith Badner, Dandan Zhang, Elliot Gershon, Li Jin, and Chunyu Liu
National Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, People's Republic of China.

The expression microarray is a frequently used approach to study gene expression on a genome-wide scale. However, the data produced by the thousands of microarray studies published annually are confounded by "batch effects," the systematic error introduced when samples are processed in multiple batches. Although batch effects can be reduced by careful experimental design, they cannot be eliminated unless the whole study is done in a single batch. A number of programs are now available to adjust microarray data for batch effects prior to analysis. We systematically evaluated six of these programs using multiple measures of precision, accuracy and overall performance. ComBat, an Empirical Bayes method, outperformed the other five programs by most metrics. We also showed that it is essential to standardize expression data at the probe level when testing for correlation of expression profiles, due to a sizeable probe effect in microarray data that can inflate the correlation among replicates and unrelated samples.
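The point about probe-level standardization can be illustrated with a small simulation. The sketch below is not the authors' procedure or data: it assumes a hypothetical expression matrix (probes in rows, samples in columns) in which every probe has its own baseline intensity and the samples are otherwise unrelated by construction. The names `standardize_probes` and `mean_pairwise_corr`, and all simulation parameters, are illustrative assumptions.

```python
import numpy as np


def standardize_probes(expr):
    """Z-score each probe (row) across samples (columns)."""
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    return (expr - mean) / sd


def mean_pairwise_corr(mat):
    """Average Pearson correlation over all distinct pairs of samples (columns)."""
    corr = np.corrcoef(mat, rowvar=False)
    upper = corr[np.triu_indices_from(corr, k=1)]
    return upper.mean()


# Hypothetical simulation (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
n_probes, n_samples = 2000, 20

# Probe effect: each probe has its own baseline intensity shared by all samples.
probe_effect = rng.normal(loc=8.0, scale=2.0, size=(n_probes, 1))
# Independent noise per sample, so the samples share no biological signal.
noise = rng.normal(scale=0.5, size=(n_probes, n_samples))
expr = probe_effect + noise

raw_r = mean_pairwise_corr(expr)
std_r = mean_pairwise_corr(standardize_probes(expr))

print(f"mean sample-sample correlation, raw data:           {raw_r:.3f}")
print(f"mean sample-sample correlation, probe-standardized: {std_r:.3f}")
```

Under these assumptions the raw correlation between unrelated samples comes out high simply because all samples share the same probe baselines, while the standardized correlation falls close to zero. It tends to be slightly negative, because centering each probe forces its values to sum to zero across samples.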

UI MeSH Term Description Entries
D012015 Reference Standards A basis of value established for the measure of quantity, weight, extent or quality, e.g. weight standards, standard solutions, methods, techniques, and procedures used in diagnosis and therapy. Standard Preparations,Standards, Reference,Preparations, Standard,Standardization,Standards,Preparation, Standard,Reference Standard,Standard Preparation,Standard, Reference
D012107 Research Design A plan for collecting and utilizing data so that desired information can be obtained with sufficient precision or so that an hypothesis can be tested properly. Experimental Design,Data Adjustment,Data Reporting,Design, Experimental,Designs, Experimental,Error Sources,Experimental Designs,Matched Groups,Methodology, Research,Problem Formulation,Research Methodology,Research Proposal,Research Strategy,Research Technics,Research Techniques,Scoring Methods,Adjustment, Data,Adjustments, Data,Data Adjustments,Design, Research,Designs, Research,Error Source,Formulation, Problem,Formulations, Problem,Group, Matched,Groups, Matched,Matched Group,Method, Scoring,Methods, Scoring,Problem Formulations,Proposal, Research,Proposals, Research,Reporting, Data,Research Designs,Research Proposals,Research Strategies,Research Technic,Research Technique,Scoring Method,Source, Error,Sources, Error,Strategies, Research,Strategy, Research,Technic, Research,Technics, Research,Technique, Research,Techniques, Research
D003627 Data Interpretation, Statistical Application of statistical procedures to analyze specific observed or assumed facts from a particular study. Data Analysis, Statistical,Data Interpretations, Statistical,Interpretation, Statistical Data,Statistical Data Analysis,Statistical Data Interpretation,Analyses, Statistical Data,Analysis, Statistical Data,Data Analyses, Statistical,Interpretations, Statistical Data,Statistical Data Analyses,Statistical Data Interpretations
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D001499 Bayes Theorem A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result. Bayesian Analysis,Bayesian Estimation,Bayesian Forecast,Bayesian Method,Bayesian Prediction,Analysis, Bayesian,Bayesian Approach,Approach, Bayesian,Approachs, Bayesian,Bayesian Approachs,Estimation, Bayesian,Forecast, Bayesian,Method, Bayesian,Prediction, Bayesian,Theorem, Bayes
D012372 ROC Curve A graphic means for assessing the ability of a screening test to discriminate between healthy and diseased persons; may also be used in other studies, e.g., distinguishing stimuli responses as to a faint stimuli or nonstimuli. ROC Analysis,Receiver Operating Characteristic,Analysis, ROC,Analyses, ROC,Characteristic, Receiver Operating,Characteristics, Receiver Operating,Curve, ROC,Curves, ROC,ROC Analyses,ROC Curves,Receiver Operating Characteristics
D015983 Selection Bias The introduction of error due to systematic differences in the characteristics between those selected and those not selected for a given study. In sampling bias, error is the result of failure to ensure that all members of the reference population have a known chance of selection in the sample. Bias, Selection,Sampling Bias,Sampling Biases,Sampling Error,Selection Biases,Bias, Sampling,Biases, Sampling,Biases, Selection,Error, Sampling,Errors, Sampling,Sampling Errors
D016022 Case-Control Studies Comparisons that start with the identification of persons with the disease or outcome of interest and a control (comparison, referent) group without the disease or outcome of interest. The relationship of an attribute is examined by comparing both groups with regard to the frequency or levels of outcome over time. Case-Base Studies,Case-Comparison Studies,Case-Referent Studies,Matched Case-Control Studies,Nested Case-Control Studies,Case Control Studies,Case-Compeer Studies,Case-Referrent Studies,Case Base Studies,Case Comparison Studies,Case Control Study,Case Referent Studies,Case Referrent Studies,Case-Comparison Study,Case-Control Studies, Matched,Case-Control Studies, Nested,Case-Control Study,Case-Control Study, Matched,Case-Control Study, Nested,Case-Referent Study,Case-Referrent Study,Matched Case Control Studies,Matched Case-Control Study,Nested Case Control Studies,Nested Case-Control Study,Studies, Case Control,Studies, Case-Base,Studies, Case-Comparison,Studies, Case-Compeer,Studies, Case-Control,Studies, Case-Referent,Studies, Case-Referrent,Studies, Matched Case-Control,Studies, Nested Case-Control,Study, Case Control,Study, Case-Comparison,Study, Case-Control,Study, Case-Referent,Study, Case-Referrent,Study, Matched Case-Control,Study, Nested Case-Control
D046228 Microarray Analysis The simultaneous analysis, on a microchip, of multiple samples or targets arranged in an array format. Microarray Analytical Devices,Microarray Microchips,Nanoarray Analytical Devices,Analysis, Microarray,Analytical Device, Microarray,Analytical Device, Nanoarray,Analytical Devices, Microarray,Analytical Devices, Nanoarray,Device, Microarray Analytical,Device, Nanoarray Analytical,Devices, Microarray Analytical,Devices, Nanoarray Analytical,Microarray Analytical Device,Microarray Microchip,Microchip, Microarray,Microchips, Microarray,Nanoarray Analytical Device
D054928 Validation Studies as Topic Works on research using processes by which the reliability and relevance of a procedure for a specific purpose are established.
