
Fusion of Generative AI Techniques and Machine Learning Models to Generate and Investigate Biosignals for Glucose Sensors

Author(s)
Sharma, Kirti; Suman, Pandey; Tiwari, Pawan K.
Type
Article
Citation
ACS Omega, v.10, no.47, pp.57107 - 57122
Issued Date
2025-12
Abstract
This research presents a cutting-edge, inexpensive technology to predict hematological parameters from the amperometric data set of a hand-held glucometer. The data set contains peak current (Ip in μA), time corresponding to the current (Tp in s), hematocrit volume (Hv in %), glucose concentration (Gc in mg/dL), and blood viscosity at shear rates of 12 s⁻¹ (Vis_12 in cP) and 120 s⁻¹ (Vis_120 in cP). Using machine learning (ML) models, we deciphered an interconnection between the blood glucose concentration and the hemoglobin level through the hematocrit volume of the blood. The ML models, namely linear regression (LR), support vector regressor (SVR), decision tree (DT), random forest regressor (RFR), extreme gradient boosting regressor (XGBoost), light gradient boosting regressor (Light GBM), and artificial neural networks (ANNs), predicted Gc, Hv, Vis_12, Vis_120, Hgb, and Occ with acceptable accuracy, as corroborated by statistical metrics: the R-squared (R²) score, mean squared error (MSE), and root-mean-squared error (RMSE). The ML models were trained on 80% of the data set and validated on the remaining 20%. Furthermore, the reliability of the models was tested via relative error (RE), K-fold cross-validation, and 95% confidence intervals. Moreover, five thousand synthetic data sets were generated with generative artificial intelligence (Gen AI) models, namely a Generative Adversarial Network (GAN), a Variational Auto-Encoder (VAE), and a Gaussian copula (Gcop), a multivariate distribution technique. The synthetic data sets were assessed by training the developed ML models on the synthetic data set and testing them on the original data set. This approach enabled validation of model performance by comparing the original data with the predicted outputs.
The statistical metrics of the models trained and tested on the original data set were compared with those of the models evaluated with the synthetic data set. While XGBoost outperformed the other models on the original data set, Light GBM surpassed all models, including XGBoost, on the Gcop-generated data set, making it the most reliable model for synthetic data applications. A limitation of this work is viscosity prediction on the Gcop-generated synthetic data set, as corroborated by SHAP analysis. In future work, we aim to refine the generative process so that it produces feasible values for the viscosity variables. © 2025 The Authors. Published by American Chemical Society
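The evaluation protocol described in the abstract (an 80/20 split scored with R², MSE, and RMSE, plus training on synthetic data and testing on the original data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy (Ip, Gc) values are made up, a one-feature least-squares line stands in for the ML models, and a bivariate normal fitted to the toy data stands in for the generator (this is the simplest special case of a Gaussian copula, one with normal marginals). All function names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def score(y_true, y_pred):
    """Return R^2, MSE and RMSE for paired true and predicted values."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return {"r2": 1.0 - n * mse / ss_tot, "mse": mse, "rmse": math.sqrt(mse)}


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train 80%, test 20%)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(0.2 * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]


def sample_bivariate_normal(rows, n_samples, seed=0):
    """Draw synthetic (x, y) pairs from a bivariate normal fitted to rows.

    Assumption: this stands in for a generative model; it is not the
    GAN, VAE, or Gaussian copula implementation used in the article.
    """
    rng = random.Random(seed)
    xs, ys = zip(*rows)
    n = len(rows)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    rho = sum((x - mx) * (y - my) for x, y in rows) / (n * sx * sy)
    out = []
    for _ in range(n_samples):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y_z = rho * z1 + math.sqrt(1 - rho ** 2) * z2  # correlated with z1
        out.append((mx + sx * z1, my + sy * y_z))
    return out


def evaluate(train, test):
    """Fit the line on train and score its predictions on test."""
    slope, intercept = fit_line(*zip(*train))
    xs, ys = zip(*test)
    return score(ys, [slope * x + intercept for x in xs])


# Made-up toy data: peak current Ip and a linearly related glucose value Gc.
rng = random.Random(42)
real = []
for _ in range(200):
    ip = rng.uniform(2.0, 20.0)
    gc = 20.0 * ip + 30.0 + rng.gauss(0.0, 5.0)
    real.append((ip, gc))

train, test = split_80_20(real)
baseline = evaluate(train, test)                 # train on real, test on real
synthetic = sample_bivariate_normal(real, 5000)  # 5000 synthetic rows
tstr = evaluate(synthetic, real)                 # train on synthetic, test on real

print("real -> real:     ", baseline)
print("synthetic -> real:", tstr)
```

If the synthetic rows preserve the relationship between the variables, the two sets of metrics should be close; a large gap would suggest that the generator distorts the joint distribution.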
Publisher
American Chemical Society
ISSN
2470-1343
DOI
10.1021/acsomega.5c05979
URI
https://scholar.gist.ac.kr/handle/local/33516
Access and License
  • Access type: Open
File List
  • No related files exist.
