Abstract:
Education plays a crucial role in shaping a country's development, and educational institutions strive to create quality learning environments that enhance student performance. Predicting student performance nevertheless remains a challenge because academic outcomes depend on complex and multifaceted factors. Existing models often lack sufficient accuracy and robustness in multiclass prediction, highlighting the need for more effective approaches. This study addresses this gap by developing a predictive framework that classifies student performance accurately, supporting early interventions, personalized learning, efficient resource allocation, and data-driven decision-making at the individual, institutional, and policy levels. A unique dataset was collected from three primary schools, encompassing demographic, behavioral, and academic information. To improve model accuracy, various hyperparameter optimization techniques were applied, and the Synthetic Minority Over-sampling Technique (SMOTE) was used to address class imbalance. Six machine learning algorithms, namely Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Naïve Bayes (NB), and Logistic Regression (LR), were used to build predictive models. Our findings show that the Random Forest algorithm achieved the highest accuracy, 99%, with an AUC of 100%, underscoring its effectiveness in identifying students' performance categories.
Keywords: Education; Student performance; Machine learning; Predictive models; SMOTE; Hyperparameter optimization