BDU IR

SMART SIM BOX FRAUD PREDICTION USING DATA MINING TECHNIQUES


dc.contributor.author Yonas, Tadesse
dc.date.accessioned 2024-12-05T07:55:54Z
dc.date.available 2024-12-05T07:55:54Z
dc.date.issued 2024-06
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/16288
dc.description.abstract Telecommunication fraud, also known as telecom fraud, is the use of telecommunication networks to carry out illegal activities, including theft of personal information, financial fraud, and identity theft. Premium Rate Service Fraud, International Call Revenue Share Fraud, and Bypass Fraud are the top three types of telecom fraud that cause major revenue losses for telecom operators. Subscriber Identity Module box (SIM box) fraud is one type of bypass fraud. Its purpose is to make international calls at a lower cost than the official tariff, which lets fraudsters profit from the tariff difference. Telecom operators use numerous fraud detection methodologies; Test Call Generators and Fraud Management Systems are common ones. However, fraudsters easily evade these detection approaches. In addition, neither approach detects fraudulent numbers in near real time. More advanced methods are therefore required. The main objective of this research is to support near-real-time decision making in the detection of bypass fraud. To build a near-real-time SIM box detection model, Call Detail Record (CDR) analysis was conducted using three datasets aggregated at 1-hour, daily, and monthly granularity. Four machine learning algorithms were implemented: Gradient Boosted Trees (GBT), Naive Bayes, Artificial Neural Networks (ANN), and Support Vector Machines (SVM). The Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology was used for data analysis. The results show that ANN outperforms the other three algorithms, with an accuracy of 99.35% and fewer false positives for near-real-time detection on the 1-hour aggregated dataset. Models evaluated on the same data at different granularity levels show acceptable performance: the maximum difference between any two granularity levels is no more than 0.9%.
Keywords: Machine learning, Data Mining, Call Detail Records, prediction en_US
dc.language.iso en_US en_US
dc.subject Computer Science en_US
dc.title SMART SIM BOX FRAUD PREDICTION USING DATA MINING TECHNIQUES en_US
dc.type Thesis en_US

