HANDLING IMBALANCED DATA ON MULTILEVEL DEPRESSION CLASSIFICATION: CHALLENGES AND SOLUTIONS
Keywords:
Imbalanced Data; Multilevel Depression Classification; ADASYN; Online Social Network
Abstract
This study addresses the challenges that imbalanced data poses for multilevel depression classification by applying the Adaptive Synthetic (ADASYN) sampling technique. Subject Matter Experts (SMEs) annotate data collected from X into four categories: None, Mild, Moderate, and Severe. The resulting class distribution is imbalanced, with None forming the largest group, which motivates the use of ADASYN for data augmentation of the smaller classes. The research framework comprises four stages: Data Collection, Expert Data Annotation, Text Preprocessing, and Text Representation and Classification. Evaluation metrics, including Recall and F1 score, are used to gauge the model's effectiveness in multilevel depression classification. The results show that the ADASYN-enhanced model, specifically with XGBoost, achieves improved classification accuracy, especially for the minority classes. The study contributes to the field of multilevel depression classification by demonstrating the effectiveness of ADASYN in handling imbalanced data and the applicability of XGBoost for improving model performance.
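The ADASYN step described in the abstract can be illustrated with a minimal, dependency-free sketch. It is a simplified one-vs-rest variant written for this illustration, not the authors' implementation; the abstract does not state which implementation was used. The function names, the parameters `k` and `beta`, and the toy 2-D vectors (standing in for text representations) are all assumptions. For every class smaller than the largest one, the sketch measures how many of each point's nearest neighbours belong to other classes, then generates more synthetic points around those harder-to-learn examples by interpolating towards same-class neighbours.

```python
import math
import random
from collections import Counter


def neighbours(i, pool, X, k):
    """Return the k indices in `pool` (excluding i) whose points lie nearest to X[i]."""
    others = [j for j in pool if j != i]
    others.sort(key=lambda j: math.dist(X[i], X[j]))
    return others[:k]


def adasyn_oversample(X, y, k=2, beta=1.0, seed=0):
    """Simplified one-vs-rest ADASYN (illustrative): append synthetic minority points."""
    rng = random.Random(seed)
    counts = Counter(y)
    largest = max(counts.values())
    all_idx = range(len(X))
    X_out, y_out = list(X), list(y)

    for label, n in counts.items():
        budget = round((largest - n) * beta)  # synthetic points wanted for this class
        if budget <= 0:
            continue  # the largest class is left unchanged
        own = [i for i in all_idx if y[i] == label]
        # Share of each point's k nearest neighbours that belong to other classes.
        ratios = [
            sum(y[j] != label for j in neighbours(i, all_idx, X, k)) / k
            for i in own
        ]
        norm = sum(ratios)
        if norm == 0:
            continue  # no point of this class borders another class
        for i, r in zip(own, ratios):
            mates = neighbours(i, own, X, k)  # same-class neighbours for interpolation
            if not mates:
                continue
            # Points surrounded by other classes receive a larger share of the budget.
            for _ in range(round(r / norm * budget)):
                j = rng.choice(mates)
                lam = rng.random()
                X_out.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
                y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: the class names follow the abstract, the vectors are made up.
X = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (2, 0), (2, 1), (1, 2),  # None
    (3, 0), (3, 1), (4, 0), (4, 1),                                      # Mild
    (0, 3), (1, 3), (0, 4),                                              # Moderate
    (3, 3), (4, 4),                                                      # Severe
]
y = ["None"] * 8 + ["Mild"] * 4 + ["Moderate"] * 3 + ["Severe"] * 2

X_res, y_res = adasyn_oversample(X, y)
print("before:", dict(Counter(y)))
print("after: ", dict(Counter(y_res)))
```

In a setup like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that Recall and F1 score are measured on unaltered test data. Because per-point sample counts are rounded, the resampled class sizes are only approximately balanced.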
License
Copyright (c) 2025 Malaysian Journal of Computer Science

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

