Handling Multiclass Imbalance in Diabetes, Cancer, and Pneumonia Classification Using NR-Clustering SMOTE
DOI:
https://doi.org/10.71129/ijaci.v2i2.pp83-95
Keywords:
Multiclass classification, SMOTE, NR-Clustering, Medical diagnosis, Imbalanced data
Abstract
Imbalanced data in multiclass health classification often biases model predictions against underrepresented but critical disease classes such as cancer and pneumonia. Traditional oversampling techniques like SMOTE often generate noisy samples and increase class overlap, which limits their effectiveness in such complex domains. This research addresses multiclass imbalance in the classification of diabetes, cancer, and pneumonia by proposing an improved oversampling technique, NR-Clustering SMOTE, which integrates K-Means clustering and Euclidean distance. The proposed method first filters noisy data using k-NN, then clusters the minority class data with K-Means (optimized via the Silhouette Score), and finally applies SMOTE within each cluster using Euclidean distance. This keeps sample generation local, minimizes noise, and reduces class overlap. The balanced dataset is then evaluated using ten machine learning algorithms, including Extra Trees, Random Forest, and Stacked Ensemble. Experimental results show significant improvements in classification metrics, especially for minority classes. For instance, after oversampling, Extra Trees achieved 89% accuracy and an AUC of 0.97, compared to only 48% accuracy and an AUC of 0.50 on the original dataset. This demonstrates that NR-Clustering SMOTE effectively improves classifier sensitivity toward minority classes without compromising majority class performance. The improvement is consistent across the models tested, indicating the robustness of the proposed method. In conclusion, NR-Clustering SMOTE with Euclidean distance, combined with ensemble classifiers like Extra Trees, is a promising solution for handling multiclass imbalanced health data, particularly in domains requiring accurate detection of minority diseases.
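The three stages described in the abstract (k-NN noise filtering, Silhouette-tuned K-Means on each under-represented class, and SMOTE interpolation inside each cluster) can be sketched in dependency-free Python (3.8+) as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The function names are hypothetical, and several details that the abstract does not specify are assumed here: the noise rule (a point is dropped when none of its k nearest neighbours shares its class), the range of cluster counts tried, the even split of synthetic samples across clusters, and balancing every class up to the size of the largest class.

```python
import math
import random


def filter_noise(X, y, k=3):
    """Indices of points with a same-class point among their k nearest (assumed rule)."""
    keep = []
    for i, xi in enumerate(X):
        nearest = sorted((j for j in range(len(X)) if j != i),
                         key=lambda j: math.dist(xi, X[j]))[:k]
        if any(y[j] == y[i] for j in nearest):
            keep.append(i)
    return keep


def kmeans(points, k, rng, iters=25):
    """Plain Lloyd's algorithm with Euclidean distance; returns a label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters contribute 0."""
    ids = sorted(set(labels))
    if len(ids) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        mean_d = {}
        for c in ids:
            ds = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            if ds:
                mean_d[c] = sum(ds) / len(ds)
        if labels[i] not in mean_d:
            continue
        a = mean_d[labels[i]]
        b = min(d for c, d in mean_d.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_clustering(points, rng, k_max=4):
    """Try k = 2..k_max (assumed range) and keep the labelling with the highest silhouette."""
    best_labels, best_score = [0] * len(points), -1.0
    for k in range(2, min(k_max, len(points) - 1) + 1):
        labels = kmeans(points, k, rng)
        score = silhouette(points, labels)
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels


def smote_in_cluster(members, n_new, k, rng):
    """Interpolate between a random member and one of its k nearest cluster mates."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(members))
        mates = sorted((j for j in range(len(members)) if j != i),
                       key=lambda j: math.dist(members[i], members[j]))[:k]
        j = rng.choice(mates)
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(members[i], members[j])])
    return synthetic


def nr_clustering_smote(X, y, k_noise=3, k_smote=3, seed=0):
    """Filter noise, then oversample each class up to the largest class size (assumed target)."""
    rng = random.Random(seed)
    keep = filter_noise(X, y, k_noise)
    X_out, y_out = [list(X[i]) for i in keep], [y[i] for i in keep]
    base = list(zip(X_out, y_out))  # snapshot of the filtered, real samples
    counts = {c: y_out.count(c) for c in sorted(set(y_out))}
    target = max(counts.values())
    for cls, n in counts.items():
        points = [x for x, lab in base if lab == cls]
        need = target - n
        if need == 0 or len(points) < 2:
            continue
        labels = best_clustering(points, rng)
        clusters = [[p for p, lab in zip(points, labels) if lab == c] for c in sorted(set(labels))]
        # Singleton clusters cannot be interpolated; fall back to the whole class if none remain.
        clusters = [c for c in clusters if len(c) >= 2] or [points]
        for idx, members in enumerate(clusters):
            share = need // len(clusters) + (1 if idx < need % len(clusters) else 0)
            X_out += smote_in_cluster(members, share, k_smote, rng)
            y_out += [cls] * share
    return X_out, y_out
```

Calling `nr_clustering_smote(X, y)` with `X` as a list of numeric feature vectors and `y` as the matching class labels returns the filtered original samples followed by the synthetic ones. Because each synthetic point is interpolated between two members of the same cluster, it stays inside the region that cluster already occupies, which is the localization property the abstract attributes to the method. For real datasets, library implementations of K-Means, the silhouette score, and SMOTE (for example in scikit-learn and imbalanced-learn) would likely be a better fit than these hand-written helpers.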
License
Copyright (c) 2026 Chalvina Izumi Amalia, Nidya Rahmawati (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.


