Machine Learning Models for Data-Driven Prediction of Diabetes by Lifestyle Type. 2022

Yifan Qin, Jinlong Wu, Wen Xiao, Kun Wang, Anbing Huang, Bowen Liu, Jingxuan Yu, Chuhao Li, Fengyu Yu, and Zhanbing Ren
College of Physical Education, Shenzhen University, Shenzhen 518000, China.

The prevalence of diabetes has been increasing in recent years, and previous research has found machine-learning models to be effective diabetes prediction tools. The purpose of this study was to compare the efficacy of five machine-learning models for diabetes prediction using lifestyle data from the National Health and Nutrition Examination Survey (NHANES) database. The 1999-2020 NHANES database yielded data on 17,833 individuals, covering demographic characteristics and lifestyle-related variables. To screen training data for the machine-learning models, the Akaike Information Criterion (AIC) forward propagation algorithm was used. Five machine-learning models (CATBoost, XGBoost, Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM)) were developed to predict diabetes. Model performance was evaluated using accuracy, sensitivity, specificity, precision, F1 score, and the receiver operating characteristic (ROC) curve. Across the five machine-learning models, dietary intake levels of energy, carbohydrate, and fat contributed the most to the prediction of diabetes. In terms of model performance, CATBoost ranked higher than RF, LR, XGBoost, and SVM, achieving an accuracy of 82.1% and an AUC of 0.83. Machine-learning models based on NHANES data can assist medical institutions in identifying patients with diabetes.
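The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' pipeline; the function names (`binary_metrics`, `roc_auc`) and the toy labels and scores are hypothetical and chosen only to show how accuracy, sensitivity, specificity, precision, F1 score, and the area under the ROC curve (AUC) are derived from a binary classifier's output. It assumes labels are coded 1 for diabetes and 0 for no diabetes.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float  # recall on the positive (diabetes) class
    specificity: float  # recall on the negative class
    precision: float
    f1: float


def _safe_div(num, den):
    # Return 0.0 instead of raising when a class or prediction is absent.
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = _safe_div(tp, tp + fn)
    precision = _safe_div(tp, tp + fp)
    return Metrics(
        accuracy=_safe_div(tp + tn, len(pairs)),
        sensitivity=sensitivity,
        specificity=_safe_div(tn, tn + fp),
        precision=precision,
        f1=_safe_div(2 * precision * sensitivity, precision + sensitivity),
    )


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC area).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return _safe_div(wins, len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: true labels and predicted probabilities.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption

    print(binary_metrics(y_true, y_pred))
    print("AUC:", roc_auc(y_true, scores))
```

Note that accuracy, sensitivity, specificity, precision, and F1 depend on the classification threshold (0.5 here, an assumption), whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports both an accuracy and an AUC for the best model.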

UI | MeSH Term | Description | Entries
D008019 | Life Style | Typical way of life or manner of living characteristic of an individual or group. (From APA, Thesaurus of Psychological Index Terms, 8th ed) | Lifestyle Factors,Life Style Induced Illness,Lifestyle,Factor, Lifestyle,Life Styles,Lifestyle Factor,Lifestyles
D009749 | Nutrition Surveys | A systematic collection of factual data pertaining to the nutritional status of a human population within a given geographic area. Data from these surveys are used in preparing NUTRITION ASSESSMENTS. | NHANES,National Health and Nutrition Examination Survey,Nutritional Surveys,Nutrition Survey,Nutritional Survey,Survey, Nutrition,Survey, Nutritional,Surveys, Nutrition,Surveys, Nutritional
D003920 | Diabetes Mellitus | A heterogeneous group of disorders characterized by HYPERGLYCEMIA and GLUCOSE INTOLERANCE. |
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000069550 | Machine Learning | A type of ARTIFICIAL INTELLIGENCE that enables COMPUTERS to independently initiate and execute LEARNING when exposed to new data. | Transfer Learning,Learning, Machine,Learning, Transfer
D016015 | Logistic Models | Statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable. A common application is in epidemiology for estimating an individual's risk (probability of a disease) as a function of a given risk factor. | Logistic Regression,Logit Models,Models, Logistic,Logistic Model,Logistic Regressions,Logit Model,Model, Logistic,Model, Logit,Models, Logit,Regression, Logistic,Regressions, Logistic
