Predicting Type 2 Diabetes Mellitus using Machine Learning Algorithms


  • Nisreen Sulayman, Damascus University


Keywords: Type 2 Diabetes Mellitus, Machine Learning, XGBoost Model, Logistic Regression



Purpose: To build an effective machine learning (ML) model for predicting the risk of type 2 (non-insulin-dependent) diabetes mellitus (T2DM).

Methods: I developed two ML prediction models, one based on extreme gradient boosting (XGBoost) and one on logistic regression (LR), and evaluated them on the Pima Indian Diabetes dataset (PIDD). The dataset, from the National Institute of Diabetes and Digestive and Kidney Diseases, comprises 768 patients: 500 non-diabetic and 268 diabetic.
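As a quick check on the class balance described in the Methods, the short sketch below computes the share of diabetic patients and the accuracy of a trivial classifier that always predicts the majority class. Only the two class counts come from the abstract; the variable names are illustrative and not taken from the study.

```python
# Class counts as stated in the abstract (PIDD).
n_non_diabetic = 500
n_diabetic = 268

total = n_non_diabetic + n_diabetic

# Fraction of patients labelled diabetic (the positive class).
prevalence = n_diabetic / total

# Accuracy of a baseline that always predicts the majority class.
majority_baseline_accuracy = max(n_non_diabetic, n_diabetic) / total

print(f"total patients: {total}")
print(f"diabetic share: {prevalence:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline_accuracy:.3f}")
```

Because roughly two thirds of the patients are non-diabetic, accuracy alone can look respectable for a model that never flags anyone, which is one reason to also report sensitivity, specificity, precision, F1-score, and AUROC.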

Results: Model performance was evaluated using six criteria. The XGBoost model outperformed logistic regression, achieving an area under the receiver operating characteristic curve (AUROC) of 85%, sensitivity of 71%, specificity of 81%, accuracy of 77%, precision of 67%, and F1-score of 69%.
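Several of the reported metrics are linked through the confusion matrix. The sketch below illustrates this by deriving accuracy, precision, and F1-score from the reported sensitivity and specificity. It assumes, purely for illustration, that the evaluation set has the same class counts as the full dataset (268 diabetic, 500 non-diabetic); the abstract does not state the actual evaluation split, and the function name `derived_metrics` is hypothetical.

```python
def derived_metrics(sensitivity, specificity, n_pos, n_neg):
    """Derive accuracy, precision and F1 from sensitivity, specificity and class counts."""
    # Expected confusion-matrix cells (may be fractional because the inputs are rounded).
    tp = sensitivity * n_pos   # diabetic patients correctly flagged
    tn = specificity * n_neg   # non-diabetic patients correctly cleared
    fp = n_neg - tn            # non-diabetic patients wrongly flagged

    accuracy = (tp + tn) / (n_pos + n_neg)
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and sensitivity (recall).
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "precision": precision, "f1": f1}


# Reported XGBoost sensitivity and specificity; class counts are an ASSUMPTION
# (full-dataset counts), since the evaluation split is not given in the abstract.
metrics = derived_metrics(sensitivity=0.71, specificity=0.81, n_pos=268, n_neg=500)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption, the derived values fall within about one percentage point of the reported accuracy (77%), precision (67%), and F1-score (69%), which suggests the reported figures are mutually consistent.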

Conclusion: This study showed that the XGBoost ML algorithm can be used to identify individuals at high risk of T2DM at an early stage, which has strong potential to help control diabetes mellitus.



How to Cite

Sulayman N. Predicting Type 2 Diabetes Mellitus using Machine Learning Algorithms. Tuj-eng [Internet]. 2022 Nov 17 [cited 2024 Feb 28];44(5):89-100. Available from: