In-depth analysis of machine learning models and explainable artificial intelligence methods in diabetes diagnosis

Original Turkish title: Diyabet hastalığı teşhisinde makine öğrenimi modelleri ile açıklanabilir yapay zeka yöntemlerinin analizi


Güler H., Avcı D., Ulaş M., Omma T.

Journal of the Faculty of Engineering and Architecture of Gazi University, vol. 40, no. 3, pp. 1995-2011, 2025 (SCI-Expanded, Scopus, TRDizin)

  • Publication Type: Article / Full Article
  • Volume: 40, Issue: 3
  • Publication Date: 2025
  • DOI: 10.17341/gazimmfd.1552790
  • Journal Name: Journal of the Faculty of Engineering and Architecture of Gazi University
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Art Source, Compendex, TR DİZİN (ULAKBİM)
  • Pages: pp. 1995-2011
  • Keywords: Diabetes, Explainable Artificial Intelligence, LIME, SHAP, XGBoost
  • Affiliated with Lokman Hekim University: Yes

Abstract

With the rise of large datasets in the healthcare sector, machine learning methods have become important for analyzing diabetes datasets, making predictions from them, and discovering patterns within them. This study addresses the early diagnosis of diabetes by comparing the performance of seven machine learning models and examining the contribution of Explainable Artificial Intelligence (XAI) techniques. The seven models, namely K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Naive Bayes, Artificial Neural Networks (ANN), Decision Trees, Random Forest, and XGBoost, were evaluated using a pipeline comprising data cleaning, preprocessing, training, and testing stages. Accuracy, F1 score, sensitivity, and specificity were used as performance metrics. Unlike many previous studies, this research integrates the XAI methods SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations) to make the best-performing model more interpretable. These techniques identified the features that contributed most to the model's decisions, giving better insight into its decision-making process. The findings were also validated through expert opinions to check their real-world applicability. XGBoost achieved the highest accuracy, 98.91%, outperforming KNN (81.18%), SVM (75.38%), Naive Bayes (75.49%), ANN (74.83%), Decision Trees (76.91%), and Random Forest (91.68%). The study highlights the potential of combining machine learning with XAI techniques for transparent and effective diabetes diagnosis.
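
The four metrics named in the abstract (accuracy, F1 score, sensitivity, and specificity) can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function names, the `BinaryMetrics` container, and the toy labels are hypothetical and chosen only for illustration. It assumes label 1 means diabetic and label 0 means non-diabetic.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # true positive rate (recall)
    specificity: float  # true negative rate
    f1: float


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), assuming 1 = diabetic and 0 = non-diabetic."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and F1 from label lists."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    sensitivity = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return BinaryMetrics(accuracy, sensitivity, specificity, f1)


# Toy labels for illustration only; these are not data from the study.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

Reporting sensitivity and specificity alongside accuracy matters in a diagnostic setting because accuracy alone can look high on an imbalanced dataset even when many positive cases are missed.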