Showing 2 results for Random Forest

Hediye Shariaty, Fatemeh Bagheri
Volume 13, Issue 1 (9-2025)
Abstract

Background: Diabetes is a prevalent condition with no definitive cure, often referred to as a “silent killer.” Diabetes is primarily categorized into three types: Type I, Type II, and gestational diabetes. In Type I diabetes, the body's immune system attacks and damages the insulin-producing cells. Conversely, Type II diabetes, which is more common than Type I, occurs when the body does not respond adequately to the insulin being produced, resulting in elevated blood sugar levels. Effectively treating pre-diabetes can prevent its progression to full-blown diabetes.
Methods: In the present research, a semi-supervised approach is proposed to predict diabetes. Improved missing value imputation (MVI) is achieved by utilizing Gaussian mixture model (GMM) clustering. The proposed classifier integrates GMM with a machine learning algorithm, specifically random forest (RF), thereby inducing a more robust predictive model via the fusion of clustering and classification techniques.
Results: The proposed method achieves an accuracy of 84%, a precision of 82.03%, a recall of 69.75%, and an F1-score of 75.12%, based on experiments conducted on the PIMA Indian population.
Conclusion: Employing GMM to fill in missing values provides the advantage of replacing invalid data with values drawn from the most similar records, thereby enhancing the quality of the dataset. The proposed classifier also exhibits strong predictive capabilities in identifying diabetes. By combining clustering-based imputation with classification, this study offers an effective method for predicting diabetes and contributes to healthcare analytics.
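
As a reading aid for the reported accuracy, precision, recall, and F1-score, the following minimal Python sketch shows how these four metrics are conventionally derived from confusion-matrix counts. The function name and the counts used are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Illustrative helper, not the authors' code.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the paper's results).
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is a harmonic mean, it sits closer to the lower of precision and recall, which is why a model with high precision but moderate recall reports an F1-score between the two.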

Mina Rahmati, Masoud Arabfard
Volume 13, Issue 1 (9-2025)
Abstract

Background: Stroke is a leading cause of disability and mortality worldwide, with ischemic strokes comprising the majority of cases. Despite advances in neuroimaging, there is a pressing need for supplementary diagnostic tools to enhance accuracy. This study explores the application of machine learning (ML) techniques to predict ischemic stroke using RNA-seq data from the GEO database (GSE22255).
Methods: We developed several machine learning models, including Random Forest, K-Nearest Neighbors (KNN), and CHAID (Chi-squared Automatic Interaction Detection), and evaluated them on accuracy, precision, specificity, and sensitivity. The analysis utilized a dataset comprising 54,676 genes across 40 samples (20 cases and 20 controls). All modeling was conducted using IBM SPSS Modeler version 18.
Results: The models were assessed based on their classification accuracy, performance evaluation scores, and AUC and Gini metrics. The Random Forest model achieved the highest accuracy (96.67% in training, 80% in testing), while the CHAID algorithm provided interpretable results, identifying TP53, CYP1A1, and CYP2D6 as key variables. The KNN model exhibited strong performance with notable confidence in its predictions.
Conclusion: This study demonstrates the potential of ML techniques, particularly Random Forest, to enhance stroke diagnosis and provide insights into stroke pathology, offering a novel approach to improving clinical decision-making. However, the study is limited by the small sample size, and future work should focus on validation with larger datasets and integration with other omics data for clinical application.
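
To make the AUC and Gini metrics mentioned in the results concrete, here is a minimal Python sketch. It assumes the common convention that the Gini coefficient equals 2·AUC − 1; the abstract does not state which definition was used. The function names, labels, and scores are hypothetical and are not data from the study.

```python
def auc_from_scores(labels, scores):
    """Estimate AUC as the fraction of (positive, negative) pairs in which
    the positive sample receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc):
    """Gini coefficient under the assumed convention Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Hypothetical labels (1 = case, 0 = control) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = auc_from_scores(labels, scores)
print(f"AUC:  {auc:.4f}")
print(f"Gini: {gini_from_auc(auc):.4f}")
```

Under this convention, a classifier that ranks cases and controls at random has an AUC of 0.5 and a Gini of 0, while a perfect ranking gives 1 for both.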



© 2025 CC BY-NC 4.0 | Jorjani Biomedicine Journal