Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. 2020

Agostina J Larrazabal, Nicolás Nieto, Victoria Peterson, Diego H Milone, and Enzo Ferrante
Research Institute for Signals, Systems and Computational Intelligence sinc(i), Universidad Nacional del Litoral-Consejo Nacional de Investigaciones Científicas y Técnicas CONICET, Santa Fe CP3000, Argentina.

Artificial intelligence (AI) systems for computer-aided diagnosis and image-based screening are being adopted worldwide by medical institutions. In this context, producing fair and unbiased classifiers is of paramount importance. The medical image computing research community is making great efforts to develop more accurate algorithms that assist medical doctors in the difficult task of disease diagnosis. However, little attention is paid to how databases are collected and how this may influence the performance of AI systems. Our study sheds light on the importance of gender balance in the medical imaging datasets used to train AI systems for computer-assisted diagnosis. We provide empirical evidence from a large-scale study based on three deep neural network architectures and two well-known, publicly available X-ray image datasets used to diagnose various thoracic diseases under different gender imbalance conditions. We found a consistent decrease in performance for underrepresented genders when a minimum level of balance is not met. This raises the alarm for national agencies in charge of regulating and approving computer-assisted diagnosis systems, which should include explicit gender balance and diversity recommendations. We also establish an open problem for the academic medical image computing community, one that needs to be addressed by novel algorithms endowed with robustness to gender imbalance.
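
The kind of experiment the abstract describes (training under a controlled gender ratio, then scoring each gender separately) can be illustrated with a minimal Python sketch. This is not the authors' code: the record layout (a dict with a `sex` key), the function names `sample_with_ratio`, `auc` and `auc_by_sex`, and the choice of AUC as the metric are all assumptions made for illustration, since the abstract specifies none of them.

```python
import random
from typing import Dict, List, Sequence


def sample_with_ratio(records: Sequence[dict], female_fraction: float,
                      n: int, seed: int = 0) -> List[dict]:
    """Draw n training records with the requested fraction of female patients.

    Hypothetical helper: assumes each record is a dict with a "sex" key
    holding "F" or "M".
    """
    rng = random.Random(seed)
    females = [r for r in records if r["sex"] == "F"]
    males = [r for r in records if r["sex"] == "M"]
    n_f = round(n * female_fraction)
    n_m = n - n_f
    if n_f > len(females) or n_m > len(males):
        raise ValueError("not enough records to satisfy the requested ratio")
    subset = rng.sample(females, n_f) + rng.sample(males, n_m)
    rng.shuffle(subset)
    return subset


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def auc_by_sex(sexes: Sequence[str], labels: Sequence[int],
               scores: Sequence[float]) -> Dict[str, float]:
    """Compute the AUC separately for each sex group in the test set."""
    result = {}
    for group in sorted(set(sexes)):
        idx = [i for i, s in enumerate(sexes) if s == group]
        result[group] = auc([labels[i] for i in idx],
                            [scores[i] for i in idx])
    return result
```

Under these assumptions, one would call `sample_with_ratio` at several values of `female_fraction`, train a classifier on each subset, and compare the per-group values returned by `auc_by_sex` on a fixed test set to see whether the underrepresented group scores lower.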

UI | MeSH Term | Description | Entries
D008297 | Male | | Males
D011857 | Radiographic Image Interpretation, Computer-Assisted | Computer systems or networks designed to provide radiographic interpretive information. | Computer Assisted Radiographic Image Interpretation; Computer-Assisted Radiographic Image Interpretation; Radiographic Image Interpretation, Computer Assisted
D012015 | Reference Standards | A basis of value established for the measure of quantity, weight, extent or quality, e.g. weight standards, standard solutions, methods, techniques, and procedures used in diagnosis and therapy. | Standard Preparations; Standards, Reference; Preparations, Standard; Standardization; Standards; Preparation, Standard; Reference Standard; Standard Preparation; Standard, Reference
D005260 | Female | | Females
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000077321 | Deep Learning | Supervised or unsupervised machine learning methods that use multiple layers of data representations generated by nonlinear transformations, instead of individual task-specific ALGORITHMS, to build and train neural network models. | Hierarchical Learning; Learning, Deep; Learning, Hierarchical
D012737 | Sex Factors | Maleness or femaleness as a constituent element or influence contributing to the production of a result. It may be applicable to the cause or effect of a circumstance. It is used with human or animal concepts but should be differentiated from SEX CHARACTERISTICS, anatomical or physiological manifestations of sex, and from SEX DISTRIBUTION, the number of males and females in given circumstances. | Factor, Sex; Factors, Sex; Sex Factor
D013902 | Radiography, Thoracic | X-ray visualization of the chest and organs of the thoracic cavity. It is not restricted to visualization of the lungs. | Thoracic Radiography; Radiographies, Thoracic; Thoracic Radiographies
D015982 | Bias | Any deviation of results or inferences from the truth, or processes leading to such deviation. Bias can result from several sources: one-sided or systematic variations in measurement from the true value (systematic error); flaws in study design; deviation of inferences, interpretations, or analyses based on flawed data or data collection; etc. There is no sense of prejudice or subjectivity implied in the assessment of bias under these conditions. | Aggregation Bias; Bias, Aggregation; Bias, Ecological; Bias, Statistical; Bias, Systematic; Ecological Bias; Outcome Measurement Errors; Statistical Bias; Systematic Bias; Bias, Epidemiologic; Biases; Biases, Ecological; Biases, Statistical; Ecological Biases; Ecological Fallacies; Ecological Fallacy; Epidemiologic Biases; Experimental Bias; Fallacies, Ecological; Fallacy, Ecological; Scientific Bias; Statistical Biases; Truncation Bias; Truncation Biases; Bias, Experimental; Bias, Scientific; Bias, Truncation; Biase, Epidemiologic; Biases, Epidemiologic; Biases, Truncation; Epidemiologic Biase; Error, Outcome Measurement; Errors, Outcome Measurement; Outcome Measurement Error
D066264 | Datasets as Topic | Subject matter related to the curation of data from research projects, stored permanently in a formalized manner suitable for communication, interpretation, or processing. | Dataset as Topic; Data Sets as Topic
