Automated detection and classification of diabetes disease based on Bangladesh demographic and health survey data, 2011 using machine learning approach
Merajul Islam, Jahanur Rahman, Dulal Chandra Roy, and Maniruzzaman
Diabetes & Metabolic Syndrome: Clinical Research & Reviews, 14(3): 217-219; DOI: 10.1016/j.dsx.2020.03.004
Background and aims
Diabetes has been recognized as a continuing health challenge for the twenty-first century in both developed and developing countries, including Bangladesh. The main objective of this study is to use machine learning (ML)-based classifiers for the automated detection and classification of diabetes.
The diabetes dataset was taken from the Bangladesh Demographic and Health Survey, 2011, and comprises 1569 respondents, of whom 127 have diabetes. Two statistical tests, the independent t-test for continuous variables and the chi-square test for categorical variables, are used to determine the risk factors of diabetes. Six ML-based classifiers, namely support vector machine, random forest, linear discriminant analysis, logistic regression, k-nearest neighbors, and bagged classification and regression tree (Bagged CART), have been adopted to predict and classify diabetes.
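To make the categorical screening step concrete, the following is a minimal sketch of a Pearson chi-square test on a 2x2 table (diabetes status against one binary factor). The function name `chi_square_2x2` and the counts in the usage lines are hypothetical illustrations, not values from the study, and the abstract does not say whether a continuity correction was applied, so none is assumed here.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Rows: diabetic / non-diabetic. Columns: factor present / factor absent.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must have a nonzero total")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts, invented for illustration only (not survey data):
# diabetic with/without the factor, then non-diabetic with/without it.
stat, p = chi_square_2x2(40, 60, 200, 700)
print(f"chi-square = {stat:.2f}, p = {p:.2g}")
```

A factor would be flagged as associated with diabetes when the p-value falls below the chosen significance level (0.05 is a common choice, assumed here rather than taken from the abstract).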
Our findings show that 11 of the 15 factors examined are significantly associated with diabetes. Bagged CART provides the highest accuracy and area under the curve, at 94.3% and 0.600, respectively.
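Accuracy and area under the curve measure different things, which matters when one class is rare. As a reading aid only (not part of the authors' analysis), the sketch below computes the accuracy of a hypothetical classifier that always predicts "no diabetes", using the respondent counts reported above, and shows a pair-counting definition of AUC. The function name `auc_by_pairs` and the toy scores are assumptions made for illustration.

```python
def auc_by_pairs(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Counts reported in the abstract.
n_total = 1569
n_diabetic = 127

# Accuracy of always predicting the majority class ("no diabetes").
majority_baseline = (n_total - n_diabetic) / n_total
print(f"majority-class accuracy: {majority_baseline:.3f}")

# Toy scores, invented for illustration: higher means "more likely diabetic".
toy_auc = auc_by_pairs([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(f"toy AUC: {toy_auc:.2f}")
```

Because only 127 of the 1569 respondents have diabetes, accuracy is dominated by the non-diabetic class, whereas AUC reflects how well the two classes are ranked against each other, with 0.5 corresponding to chance.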
Bagged CART appears to be a useful computational tool for the classification of diabetes, and it could help doctors make decisions about controlling diabetes in Bangladesh.