Abstract:
The telecom industry is one of the most important contributors to the country's economy. Telecom fraud is a major concern for telecom operators and for governments all over the world, mainly because of the security threats it poses and its economic impact. Telecom fraud is committed by one or more persons who intentionally divert income from the provider, or from individual subscribers, for their own benefit. According to international surveys, telecom fraud causes losses of billions of dollars per year.
The researcher selected 39,680 records from a large volume of call detail record (CDR) data. After inappropriate and excess data were removed, a total of 31,367 records remained for use in this research. Using CFS feature selection, 14 attributes were selected from the 31 initial attributes (fields). The data were pre-processed and prepared in a format suitable for data mining (DM) tasks. The study was conducted using MATLAB R2015a, and five data mining classification algorithms were used, namely J48, PART, RF, ANN and a hybrid algorithm. The study attempts to detect fraudulent calls that use bypass to terminate international calls. Such fraud greatly affects the revenue of telephone operators.
According to this study, the hybrid algorithm with 10-fold cross-validation had the highest performance, with 99.86% of records correctly classified. The PART algorithm with 10-fold cross-validation ranked second, with an accuracy of 99.81%. The least accurate algorithm was ANN with a percentage split of 66% for training and 34% for testing, which achieved 99.56% accuracy. In terms of processing speed, J48 was fast, taking around 1.25 seconds to process the data, whereas ANN took a long time compared with the others. J48, RF and the hybrid algorithm had medium memory usage, PART had low memory usage, and ANN had high memory usage. In general, the hybrid, PART and J48 algorithms were highly accurate (99.86%, 99.81% and 99.74%, respectively) in classifying CDR data as fraud or non-fraud using data mining in MATLAB.
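
The two evaluation schemes named above, 10-fold cross-validation and a 66%/34% percentage split, can be illustrated with a minimal sketch. It is written in Python purely for illustration (the study itself reports using MATLAB), and the function names, the shuffling seed and the way leftover records are distributed across folds are assumptions, not details taken from the study. Only the record count (31,367), the number of folds and the split ratio come from the text.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: the shuffling and seed are illustrative choices.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one record.
    fold_sizes = [n_records // k + (1 if i < n_records % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def percentage_split(n_records, train_fraction=0.66, seed=0):
    """Return (train_idx, test_idx) for a single shuffled hold-out split."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = int(n_records * train_fraction)
    return indices[:cut], indices[cut:]


def accuracy(y_true, y_pred):
    """Percentage of records whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


N_RECORDS = 31367  # records remaining after cleaning, as stated in the abstract

folds = list(k_fold_indices(N_RECORDS, k=10))
split_train, split_test = percentage_split(N_RECORDS, train_fraction=0.66)
```

In cross-validation every record is used for testing exactly once, and the reported accuracy is typically the average over the ten folds; in the percentage split only the held-out 34% is ever used for testing.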