DESIGNING BANK DISTRESS PREDICTION MODEL USING MACHINE LEARNING ALGORITHMS

TILAHUN, TADESSE TEKLU

URI: http://ir.bdu.edu.et/handle/123456789/15681

Date: 2023-06

Abstract:

Bank distress has been a major concern across the global banking industry for stakeholders, investors, and the wider business world. Predictive analysis of a bank's financial situation and customer relationships helps in responding early to conditions that may lead to bank collapse. This study designs a bank distress prediction model using secondary sources of data. An efficient Bank Distress Prediction (BDP) model has become necessary for forecasting distress, and a wide range of Machine Learning (ML) models has been developed for this purpose. However, existing BDP models perform insufficiently because of challenges such as redundant and irrelevant features and class imbalance. Class imbalance occurs when the data samples fall into two groups and the minority group contains considerably fewer samples than the majority group. The imbalanced nature of distress data makes it harder for classification algorithms to train a model and leads to off-target predictions for the minority class, which is considered more important than the majority class. These challenges degrade the performance of a distress prediction model, depending on how well the predictor handles such data problems. In this study, we propose a bank distress prediction model that addresses the class imbalance problem using feature selection techniques to select the significant features, the Synthetic Minority Oversampling Technique (SMOTE) to produce balanced data, and Random Forest (RF) as the classification algorithm. We also implement four further classifiers: Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Tree (DT), and Support Vector Machine (SVM). Random Forest is applied to the transformed (resampled) dataset.

To evaluate the proposed model, we ran experiments on the imbalanced Polish Bankruptcy dataset from the UCI Machine Learning Repository. The proposed model is expected to allow stakeholders to anticipate the future status of businesses and make decisions accordingly. The experimental results show that the proposed model performs well on the Polish Bankruptcy dataset, attaining 83% prediction accuracy, while the Decision Tree attains 78%. We conclude that the proposed model effectively improves BDP performance and offers a new way of dealing with the imbalanced dataset problem.

Keywords: Decision Tree, Support Vector Machine, Bank Distress Prediction, Synthetic Minority Oversampling Technique, Logistic Regression, K-Nearest Neighbors, Random Forest
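
The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which the abstract does not describe in detail. It is a pure-Python toy version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours). The function name `smote`, its parameters, and the toy data are assumptions made for illustration only, and the feature selection and Random Forest steps are not shown.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed): 3 "distressed" (minority) rows vs 8 "healthy" rows.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
majority = [[float(i), float(i % 3)] for i in range(5, 13)]

new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority), len(majority))  # prints "8 8"
```

In practice, resampling of this kind is usually applied to the training split only, so that synthetic rows do not leak into the data used for evaluation.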
