Abstract:
Phishing is one of the most serious cyberattacks, and researchers are still
searching for an effective defense against it. Phishing is a technique
attackers use to lure end users and steal their private data. To obtain personal information, attackers
deceive Internet users by impersonating a legitimate website. This can also be
accomplished by posing as legitimate companies or businesses in emails.
Phishing successfully exploits several vulnerabilities, and there is no
one-size-fits-all solution that protects users from all of them. To minimize
the harm caused by phishing, attacks must be detected as early as possible.
There are several methods for detecting phishing, based on whitelists,
blacklists, page content, URLs, visual similarity, and machine learning.
In this study, a hybrid ensemble approach that combines Random Forest,
AdaBoost, and XGBoost is proposed to detect phishing websites in two
phases. First, each model is evaluated individually. Second, the models are
combined and the resulting hybrid ensembles are analyzed to find the
combination of ensemble classifiers that is most robust against phishing website attacks. The
proposed approach was evaluated on an imbalanced dataset that contains a
higher percentage of legitimate URLs than phishing URLs. The dataset is used
to train and test each classification model and the hybrid ensemble model. The
experimental results show that the proposed hybrid ensemble approach achieved
an accuracy of 95.23%, indicating that it can detect phishing websites with
high accuracy. In future work, ensemble classifiers could be combined with
deep learning techniques to create hybrid models for phishing website
detection.
Keywords: Phishing, Phishing websites, Legitimate, Hybrid ensemble
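
The two-phase procedure summarized in the abstract can be illustrated with a minimal sketch. It assumes that each trained model outputs a phishing probability per URL and that models are combined by averaging those probabilities (soft voting). The abstract does not state the combination rule, so this is only one plausible reading. The function names, the labels, and all probability values below are hypothetical toy data, not results from the study.

```python
from itertools import combinations


def soft_vote(prob_lists, threshold=0.5):
    """Average per-model phishing probabilities and threshold the mean."""
    n_models = len(prob_lists)
    averaged = [sum(column) / n_models for column in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate_combinations(model_probs, y_true):
    """Score single models (phase 1) and every larger subset (phase 2)."""
    names = sorted(model_probs)
    scores = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            preds = soft_vote([model_probs[name] for name in combo])
            scores[combo] = accuracy(y_true, preds)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical, imbalanced validation labels: 0 = legitimate, 1 = phishing.
y_val = [0, 0, 0, 0, 0, 1, 1, 1]

# Made-up phishing probabilities standing in for trained model outputs.
model_probs = {
    "random_forest": [0.1, 0.2, 0.8, 0.3, 0.2, 0.9, 0.3, 0.8],
    "adaboost": [0.3, 0.4, 0.3, 0.6, 0.1, 0.7, 0.8, 0.3],
    "xgboost": [0.2, 0.1, 0.1, 0.2, 0.7, 0.8, 0.6, 0.6],
}

best, scores = evaluate_combinations(model_probs, y_val)
for combo, score in scores.items():
    print(" + ".join(combo), round(score, 3))
print("best combination:", " + ".join(best))
```

In practice the probabilities would come from classifiers fitted on the training split, and on an imbalanced dataset it may be worth tracking precision and recall for the phishing class alongside accuracy.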