dc.description.abstract |
Advancements in technology and e-commerce have made credit cards a common payment
method, reducing reliance on cash. However, this shift has also led to a rise in online fraud,
causing significant financial losses. Detecting credit card fraud and predicting credit risk
remain a challenge for banks and emerging Peer-to-Peer (P2P) lending systems. The growing
demand for credit and the rapid expansion of financial institutions highlight the need for
advanced tools to detect fraud, manage risks, streamline operations, and enhance customer
service.
In traditional banking, credit officers assess creditworthiness, but biased data and
subjective evaluations make it difficult to distinguish defaulters from non-defaulters. Similarly, P2P
lending faces challenges such as limited borrower information, trust issues, and poor risk
assessment. Information asymmetry often results in inaccurate default risk estimates. Moreover,
the online nature and high volume of applications in both sectors complicate manual risk
assessments, leading to inefficiency and herding behavior.
Efficient credit risk prediction is vital to mitigate both credit risk and fraud. While
machine learning is widely used for these purposes, standalone models often struggle with large,
complex datasets, non-linear effects, and preserving high-dimensional correlations. These
limitations hinder their predictive performance.
In this thesis, we developed hybrid machine learning models for credit card fraud detection
and credit risk prediction. The research focuses on three key areas: credit card fraud detection,
credit risk in P2P lending, and credit risk in traditional banking. A Convolutional Neural
Network (CNN) was utilized to extract features from various datasets, transforming them into
one-dimensional arrays for integration with machine learning classifiers such as Support Vector
Machine (SVM), Random Forest (RF), Decision Tree (DT), Gradient Boosting Decision Trees
(GBDT), Logistic Regression (LR), and k-Nearest Neighbors (kNN).
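The hybrid idea described above (convolutional feature extraction, flattening to a one-dimensional array, then a conventional classifier) can be illustrated with a minimal pure-Python sketch. This is not the thesis implementation: the filter values, the toy records, and the function names `conv1d_relu`, `extract_features`, and `knn_predict` are all assumptions made for illustration, and the filters here are fixed, whereas a real CNN would learn them during training.

```python
import math

def conv1d_relu(row, kernel):
    """Slide `kernel` over `row` (no padding) and apply ReLU to each output."""
    k = len(kernel)
    return [
        max(0.0, sum(row[i + j] * kernel[j] for j in range(k)))
        for i in range(len(row) - k + 1)
    ]

def extract_features(row, kernels):
    """Concatenate every filter's output into one flat (1-D) feature vector."""
    features = []
    for kernel in kernels:
        features.extend(conv1d_relu(row, kernel))
    return features

def knn_predict(train_feats, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    order = sorted(range(len(train_feats)),
                   key=lambda i: math.dist(train_feats[i], query))
    votes = [train_labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

# Hypothetical fixed filters; a trained CNN would learn these weights.
KERNELS = [[1.0, -1.0], [0.5, 0.5]]

# Hypothetical toy records: label 1 = risky, label 0 = normal.
X_train = [
    [0.1, 0.2, 0.1, 0.2],
    [0.2, 0.1, 0.2, 0.1],
    [0.9, 0.1, 0.8, 0.0],
    [1.0, 0.0, 0.9, 0.1],
]
y_train = [0, 0, 1, 1]

# Step 1: convolutional feature extraction, flattened to 1-D vectors.
train_feats = [extract_features(row, KERNELS) for row in X_train]
query_feats = extract_features([0.95, 0.05, 0.85, 0.05], KERNELS)

# Step 2: a conventional classifier (kNN here) is applied to the extracted features.
prediction = knn_predict(train_feats, y_train, query_feats, k=3)
```

Any of the classifiers listed above could take the place of `knn_predict`, since each only needs the flattened feature vectors and the labels.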
Datasets were sourced from Kaggle, a local Ethiopian bank, and P2P lending platforms. To
address data imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was
employed to generate synthetic data points. Model performance was evaluated using metrics such
as accuracy, precision, recall, F1-score, and Area Under the Curve (AUC).
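As a rough illustration of the two steps just mentioned, the sketch below shows the interpolation idea behind SMOTE and how precision, recall, and F1-score follow from prediction counts. It is a simplified stand-in under stated assumptions, not the procedure used in the thesis: the helper names `smote_like` and `precision_recall_f1`, the toy data, and the choice of `k=2` neighbours are hypothetical, and AUC is omitted for brevity.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical minority-class samples (e.g. fraud cases) with two features.
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]
new_points = smote_like(minority, n_new=4)

# Hypothetical labels and predictions: 2 true positives, 1 false positive,
# 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Because each synthetic point lies on a segment between two existing minority samples, oversampling adds variety to the minority class without simply duplicating records. On imbalanced data, accuracy alone can look high even when the minority class is poorly detected, which is why precision, recall, and F1-score are reported alongside it.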
The hybrid CNN-SVM model achieved notable success in fraud detection, with an accuracy
of 91.08%. For credit risk prediction in banks, the hybrid models outperformed traditional
methods, with CNN-SVM achieving 98.60% accuracy. In P2P lending, the CNN-kNN model
reached an accuracy of 97.60%.
These findings demonstrate that hybrid models significantly improve credit risk assessment
and fraud detection capabilities. Their integration into financial institutions’ processes can
help mitigate losses, protect customers, and optimize resource allocation. We recommend that
stakeholders in the financial sector adopt these models while ensuring ongoing evaluation for
effectiveness and compliance with industry standards. |
en_US |