pyComBat, a Python tool for batch effects correction in high-throughput molecular data using empirical Bayes methods. 2023

Abdelkader Behdenna, Maximilien Colange, Julien Haziza, Aryo Gema, Guillaume Appé, Chloé-Agathe Azencott, and Akpéli Nordor
Epigene Labs, Paris, France. abdelkader@epigenelabs.com.

BACKGROUND Variability in datasets is not only the product of biological processes; it is also the product of technical biases. ComBat and ComBat-Seq are among the most widely used tools for correcting those technical biases, called batch effects, in microarray and RNA-Seq expression data, respectively.

RESULTS In this technical note, we present a new Python implementation of ComBat and ComBat-Seq. While the mathematical framework is strictly the same, we show here that our implementations: (i) have similar results in terms of batch effects correction; (ii) are as fast as or faster than the original implementations in R; and (iii) offer new tools for the bioinformatics community to participate in its development. pyComBat is implemented in the Python language and is distributed under the GPL-3.0 license (https://www.gnu.org/licenses/gpl-3.0.en.html) as a module of the inmoose package. Source code is available at https://github.com/epigenelabs/inmoose and the Python package at https://pypi.org/project/inmoose.

CONCLUSIONS We present a new Python implementation of the state-of-the-art tools ComBat and ComBat-Seq for the correction of batch effects in microarray and RNA-Seq data. This new implementation, based on the same mathematical frameworks as ComBat and ComBat-Seq, offers similar power for batch effect correction at reduced computational cost.
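
To make the idea of batch effect correction concrete, the following is a minimal sketch of a per-gene location-and-scale adjustment, the kind of model that ComBat builds on. It is not the pyComBat code: it omits the empirical Bayes shrinkage of the batch parameters and any covariates, uses population standard deviations for simplicity, and the function name `location_scale_adjust` and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def location_scale_adjust(matrix, batches):
    """Simplified location-and-scale batch adjustment (illustrative only).

    matrix  -- list of gene rows, each a list of per-sample expression values
    batches -- batch label for each sample (same order as the columns)

    Assumption: no empirical Bayes shrinkage and no covariates, unlike ComBat.
    """
    labels = sorted(set(batches))
    adjusted = []
    for row in matrix:
        grand_mean = mean(row)

        # Pooled within-batch spread: deviations of each value from its batch mean.
        residuals = []
        for label in labels:
            values = [x for x, b in zip(row, batches) if b == label]
            batch_mean = mean(values)
            residuals.extend(x - batch_mean for x in values)
        pooled_sd = pstdev(residuals) or 1.0  # fall back to 1.0 for constant genes

        new_row = list(row)
        for label in labels:
            idx = [i for i, b in enumerate(batches) if b == label]
            values = [row[i] for i in idx]
            batch_mean = mean(values)
            batch_sd = pstdev(values) or 1.0  # avoid division by zero
            for i in idx:
                # Remove the batch-specific shift and scale, then restore
                # the gene's overall mean and pooled spread.
                new_row[i] = grand_mean + pooled_sd * (row[i] - batch_mean) / batch_sd
        adjusted.append(new_row)
    return adjusted


# Toy example: one gene measured in two batches, the second shifted by +10.
toy = [[1.0, 2.0, 3.0, 11.0, 12.0, 13.0]]
toy_batches = ["a", "a", "a", "b", "b", "b"]
corrected = location_scale_adjust(toy, toy_batches)
print([round(v, 6) for v in corrected[0]])  # both batches are now centred on the grand mean, 7
```

The empirical Bayes step that this sketch leaves out pools information across genes to stabilise the per-batch estimates, which matters most when batches contain few samples.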

UI MeSH Term Description Entries
D000081246 RNA-Seq High-throughput nucleotide sequencing techniques developed for determining and analyzing the composition of the TRANSCRIPTOME of a sample. Whole Transcriptome Shotgun Sequencing
D001499 Bayes Theorem A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result. Bayesian Analysis,Bayesian Estimation,Bayesian Forecast,Bayesian Method,Bayesian Prediction,Analysis, Bayesian,Bayesian Approach,Approach, Bayesian,Approachs, Bayesian,Bayesian Approachs,Estimation, Bayesian,Forecast, Bayesian,Method, Bayesian,Prediction, Bayesian,Theorem, Bayes
D012984 Software Sequential operating programs and data which instruct the functioning of a digital computer. Computer Programs,Computer Software,Open Source Software,Software Engineering,Software Tools,Computer Applications Software,Computer Programs and Programming,Computer Software Applications,Application, Computer Software,Applications Software, Computer,Applications Softwares, Computer,Applications, Computer Software,Computer Applications Softwares,Computer Program,Computer Software Application,Engineering, Software,Open Source Softwares,Program, Computer,Programs, Computer,Software Application, Computer,Software Applications, Computer,Software Tool,Software, Computer,Software, Computer Applications,Software, Open Source,Softwares, Computer Applications,Softwares, Open Source,Source Software, Open,Source Softwares, Open,Tool, Software,Tools, Software
D019295 Computational Biology A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. Bioinformatics,Molecular Biology, Computational,Bio-Informatics,Biology, Computational,Computational Molecular Biology,Bio Informatics,Bio-Informatic,Bioinformatic,Biologies, Computational Molecular,Biology, Computational Molecular,Computational Molecular Biologies,Molecular Biologies, Computational
