Wednesday, October 4

Machine learning uncovers hidden cases of diabetes among people with normal fasting glucose levels

In a recent study published in BMC Medicine, researchers used machine learning and common physical examination indexes to identify diabetic individuals among people with normal fasting glucose.

Image credit: NicoElNino/Shutterstock.com

What is the background?

Diabetes mellitus (DM) is a growing public health problem, and many cases go undiagnosed until complications arise. The International Diabetes Federation projects a rise from 537 million people with diabetes in 2021 to 643 million by 2030.

Undiagnosed cases burden the healthcare system, which makes early diagnosis a priority and has spurred interest in machine learning for efficient screening. However, fasting blood glucose (FBG), although convenient to measure, can miss many cases when used alone.

The prevalence of normal fasting glucose in diabetic patients highlights the need for broader screening methods and research to improve detection across different segments.

What is the study’s subject matter?

To create a model for identifying diabetic patients with normal fasting glucose, the current study used physical examination data from three hospitals. The resulting datasets, labeled D1, D2, and D3, underwent rigorous cleaning, and samples were classified according to the WHO’s diabetes diagnostic criteria.

The datasets were skewed by class imbalance, which prompted the use of SMOTE (Synthetic Minority Oversampling Technique). Z-score normalization was used for standardization.
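The two preprocessing steps can be sketched in a few lines of plain Python. This is only an illustration of the general ideas, not the study's pipeline: the function names, the toy (age, BMI) values, the choice of k, and the order of the two steps are all assumptions made for the example.

```python
import random
import statistics


def zscore_columns(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # position on the segment from base to mate
        synthetic.append([a + t * (b - a) for a, b in zip(base, mate)])
    return synthetic


# Toy data, invented for illustration: columns are (age, BMI).
majority = [[35, 22.0], [41, 23.5], [29, 21.0], [50, 24.0], [38, 22.5], [45, 23.0]]
minority = [[58, 26.0], [62, 27.5], [55, 25.0]]

# Assumed order: standardize first, then oversample the minority class.
scaled = zscore_columns(majority + minority)
scaled_minority = scaled[len(majority):]
synthetic = smote_like(scaled_minority, n_new=len(majority) - len(minority))
balanced_minority = scaled_minority + synthetic
print(len(majority), len(balanced_minority))
```

In practice, oversampling is normally applied to the training split only, so that synthetic samples never leak into the test data.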

The researchers compared several machine learning methods and found the deep neural network (DNN) to be the most suitable. Established metrics such as sensitivity and accuracy were used to evaluate the model, given the data’s pronounced class imbalance.

Initially, 27 features were used for predictions, but the researchers sought to reduce this number by eliminating potential redundancies. They settled on 13 crucial features, identified through manual curation and through max-relevance and min-redundancy (mRMR) analysis.
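To make the mRMR idea concrete, here is a minimal greedy sketch. It is an assumption-laden illustration: classic mRMR scores features with mutual information, whereas this version substitutes absolute Pearson correlation to stay dependency-free, and the feature values and the `mrmr` and `pearson` helpers are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def mrmr(features, target, n_select):
    """Greedily pick features that are relevant to the target but not
    redundant with the features already chosen."""
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected, remaining = [], list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs(pearson(features[name], features[s])) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < n_select:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy values, invented for illustration. HGB and HCT move together,
# so only one of them should survive the selection.
features = {
    "HGB": [135, 135, 140, 140, 150, 150, 155, 155],
    "HCT": [40.5, 40.5, 42, 42, 45, 45, 46.5, 48],
    "AGE": [45, 55, 40, 40, 60, 60, 45, 55],
}
target = [0, 0, 0, 0, 1, 1, 1, 1]

selected = mrmr(features, target, n_select=2)
print(selected)
```

On this toy data the second pick is AGE rather than HCT: HCT is more relevant on its own, but it is almost entirely redundant once HGB has been chosen.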

An online tool called DRING was developed for practical use. The study expanded on previous research by introducing a method that utilized the permutation feature importance algorithm, which provides individualized risk assessment for diabetes onset.

What did the study find?

The study analyzed 61,059 samples with normal fasting glucose (NFG), drawn from physical examination data collected at the First Affiliated Hospital of Wannan Medical College between 2015 and 2018.

The hemoglobin A1c (HbA1c) threshold was set at 6.5%, and nearly 1% of participants (603) were classified as diabetic. On average, the diabetic group had a Body Mass Index (BMI) 1.08 units higher and was 10.6 years older than the non-diabetic group.
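A prevalence of roughly 1% is why accuracy alone says little about such a model, and why sensitivity matters. The short calculation below uses the two counts reported above; the "classifier" that labels everyone non-diabetic is a hypothetical baseline, not anything from the study.

```python
def sensitivity(tp, fn):
    """Share of truly diabetic samples that the classifier catches."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp, tn, fp, fn):
    """Share of all samples that the classifier labels correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


total, diabetic = 61_059, 603  # counts reported in the article
prevalence = diabetic / total

# Hypothetical baseline: predict "non-diabetic" for every sample.
tp, fn = 0, diabetic
tn, fp = total - diabetic, 0

naive_accuracy = accuracy(tp, tn, fp, fn)
naive_sensitivity = sensitivity(tp, fn)
print(f"prevalence={prevalence:.4f}")
print(f"accuracy={naive_accuracy:.4f}, sensitivity={naive_sensitivity:.4f}")
```

The baseline is about 99% accurate while detecting no diabetic cases at all.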

Absolute lymphocyte count (ALC), age, FBG, BMI, and white blood cell count were the primary markers of diabetes, but an additional 11 important factors were also observed.

Model stability depended on eliminating redundancy arising from strong correlations between feature pairs, such as hemoglobin (HGB) and hematocrit (HCT), or neutrophil (NEU) and lymphocyte (LYM) measures.

Manual curation and the mRMR method each produced an optimized feature space. Thirteen of the initial 27 features were retained, covering factors such as FBG, BMI, ALC, and age. In tests, models built with 13 features performed slightly better than those with 27, showing increased precision and sensitivity.

Additional testing was conducted on two independent test sets, D2 and D3. The AUC values of both models were above 0.95 on D2 and close to 0.90 on D3. Youden’s J index was also notably high. Models based on manual curation generally performed better than those based on mRMR.
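For readers unfamiliar with these two metrics, the sketch below computes both from scratch. The scores and labels are invented toy values, and the function names are chosen for this example only. AUC is computed as the fraction of positive and negative pairs that the model ranks correctly, and Youden's J is the maximum of sensitivity + specificity - 1 over candidate thresholds.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_j(scores, labels):
    """Return (best J, threshold), where a sample is called positive
    when its score is at or above the threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_thr = -1.0, None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < thr)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_j, best_thr = j, thr
    return best_j, best_thr


# Toy model outputs, invented for illustration.
scores = [0.10, 0.20, 0.30, 0.40, 0.35, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

model_auc = auc(scores, labels)
best_j, best_threshold = youden_j(scores, labels)
print(model_auc, best_j, best_threshold)
```

Here 15 of the 16 positive and negative pairs are ranked correctly, giving an AUC of 0.9375, and the best threshold (0.35) yields J = 0.75.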

The mRMR model had a notable weakness: a high false positive rate on the heavily imbalanced D2 dataset. Nevertheless, the results indicated that the models are effective at identifying previously undiagnosed diabetics in the NFG population.

The study used the feature weights from the 13-feature manual curation model to identify the key risk factors for diabetes. The top five variables identified were ALC, FBG, age, sex, and BMI.

Previous research has indicated that an increase in FBG level within the NFG range increases the risk of diabetes. Notably, these studies reaffirmed age and BMI as well-established risk factors, while also highlighting the difference in diabetes risk between genders. Other important variables included the mean corpuscular volume (MCV) and absolute monocyte count (AMC).

A framework based on permutation feature importance (PFI) was developed to tailor diabetic risk assessments to individual patients. The approach was illustrated by dissecting the risk factors of one case from an external validation set.
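The core of permutation feature importance is simple: shuffle one feature's values across samples and measure how much the model's performance drops. The sketch below shows the generic, dataset-level version only; the article does not detail how the study adapts it to individual patients. The stand-in `model`, its coefficients, and the toy rows are all hypothetical.

```python
import random


def model(row):
    """Hypothetical stand-in classifier: uses age and BMI, ignores the third column."""
    age, bmi, _noise = row
    return 1 if 0.05 * age + 0.1 * bmi > 5.5 else 0


def accuracy(rows, labels, predict):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(rows, labels, predict, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels, predict)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(permuted, labels, predict))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy rows, invented for illustration: [age, BMI, unrelated noise value].
rows = [
    [30, 22, 5], [35, 24, 1], [40, 23, 7], [45, 25, 3],
    [65, 24, 2], [70, 26, 8], [75, 25, 4], [80, 27, 6],
]
labels = [model(r) for r in rows]  # the model is perfect on its own labels

importances = permutation_importance(rows, labels, model)
for name, value in zip(["age", "BMI", "noise"], importances):
    print(f"{name}: {value:.3f}")
```

Because the stand-in model never reads the noise column, shuffling it changes nothing and its importance is exactly zero, while shuffling age degrades accuracy the most.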

Although her FBG was within the normal range, this individual’s age, FBG, and BMI were identified as her primary diabetic risk factors. These findings highlight the importance of tailoring interventions to individual risk profiles.

The work culminated in building this analysis into the DRING web server, which makes it easier to apply in practice.
