Hybrid Machine Learning Models in Banking and Peer-to-Peer Lending for Credit Card Fraud and Credit Risk Prediction: Addressing Data Imbalance with SMOTE

Tamiru, Melese

dc.contributor.author	Tamiru, Melese
dc.date.accessioned	2025-07-31T12:13:03Z
dc.date.available	2025-07-31T12:13:03Z
dc.date.issued	2025-02
dc.identifier.issn	issn
dc.identifier.uri	http://ir.bdu.edu.et/handle/123456789/16825
dc.description.abstract	Advancements in technology and e-commerce have made credit cards a common payment method, reducing reliance on cash. However, this shift has also led to a rise in online fraud, causing significant financial losses. Detecting credit card fraud and predicting credit risk remain a challenge for banks and emerging Peer-to-Peer (P2P) lending systems. The growing demand for credit and the rapid expansion of financial institutions highlight the need for advanced tools to detect fraud, manage risks, streamline operations, and enhance customer service. In traditional banking, credit officers assess creditworthiness, but biased data and subjective evaluations make it difficult to distinguish defaulters from non-defaulters. Similarly, P2P lending faces challenges such as limited borrower information, trust issues, and poor risk assessment. Information asymmetry often results in inaccurate default risk estimates. Moreover, the online nature and high volume of applications in both sectors complicate manual risk assessments, leading to inefficiency and herding behavior. Efficient credit risk prediction is vital to mitigate both credit risk and fraud. While machine learning is widely used for these purposes, standalone models often struggle with large, complex datasets, non-linear effects, and preserving high-dimensional correlations. These limitations hinder their predictive performance. In this thesis, we developed hybrid machine learning models for credit card fraud detection and credit risk prediction. The research focuses on three key areas: credit card fraud detection, credit risk in P2P lending, and credit risk in traditional banking.
A Convolutional Neural Network (CNN) was utilized to extract features from various datasets, transforming them into one-dimensional arrays for integration with machine learning classifiers such as Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), Gradient Boosting Decision Trees (GBDT), Logistic Regression (LR), and k-Nearest Neighbors (kNN). Datasets were sourced from Kaggle, a local Ethiopian bank, and P2P lending platforms. To address data imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was employed to generate synthetic data points. Model performance was evaluated using metrics such as accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). The hybrid CNN-SVM model achieved notable success in fraud detection, with an accuracy of 91.08%. For credit risk prediction in banks, the hybrid models outperformed traditional methods, with CNN-SVM achieving 98.60% accuracy. In P2P lending, the CNN-kNN model reached an accuracy of 97.60%. These findings demonstrate that hybrid models significantly improve credit risk assessment and fraud detection capabilities. Their integration into financial institutions’ processes can help mitigate losses, protect customers, and optimize resource allocation. We recommend that stakeholders in the financial sector adopt these models while ensuring ongoing evaluation for effectiveness and compliance with industry standards.	en_US
dc.language.iso	en	en_US
dc.subject	Mathematics	en_US
dc.title	Hybrid Machine Learning Models in Banking and Peer-to-Peer Lending for Credit Card Fraud and Credit Risk Prediction: Addressing Data Imbalance with SMOTE	en_US
dc.type	Dissertation	en_US
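
The abstract states that SMOTE was used to generate synthetic data points for the minority class. As an illustration only, the core idea can be sketched in plain Python: pick a minority sample, pick one of its nearest minority neighbours, and create a new point somewhere on the line segment between them. This is a simplified sketch, not the thesis's implementation; the function name `smote_sketch`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority neighbours. Hypothetical helper for illustration."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 8 majority-class rows vs. 3 minority-class rows,
# two features each (not taken from the thesis datasets).
majority = [[0.1 * i, 0.2 * i] for i in range(8)]
minority = [[5.0, 5.0], [5.5, 4.5], [6.0, 5.2]]

# Generate enough synthetic rows to match the majority class size.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as recall and AUC are computed on data with the original class distribution.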