Abstract:
With advances in computer and communication technology, internet users and computer networks suffer from a rapidly increasing number of attacks (intrusions). These sophisticated and ever-growing network attacks begin with an attacker breaking into the network via a vulnerable host and then initiating further attacks on the local network. To address this problem, an intrusion detection system is one of the most important defense tools, alongside other security tools such as firewalls and antivirus software. There have been many attempts to address the challenges of intrusion activity using a variety of detection methods. However, low detection rate (accuracy), high false alarm rate, high processing time, and large trace size (in the case of anomaly intrusion detection) remain the main challenges. A reliable intrusion detection system can be built using two well-known techniques. The first is the anomaly-based technique, which models the behavior of a normal system and treats any deviation from normal usage (the model) as an anomaly. The second is the signature-based, or misuse-based, approach, which detects known attacks by using attack signatures or patterns. Our main task in this paper is to develop a model for a host-based intrusion detection system using these two detection approaches. For the anomaly-based technique, machine learning algorithms were applied to the Australian Defence Force Academy Linux Dataset (ADFA-LD). Features were extracted from the dataset using an N-gram-based feature extraction mechanism. In addition, we configured a host-based intrusion detection tool, Open Source Security (OSSEC), for signature-based intrusion detection. The experimental results showed that the performance of the proposed model is promising in terms of detection rate, false positive rate, processing time, and other performance metrics.
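To illustrate the N-gram-based feature extraction mentioned above, the following is a minimal sketch rather than the exact procedure used in this work. It assumes each trace is a sequence of integer system call identifiers; the function names (`extract_ngrams`, `build_vocabulary`, `vectorize`), the choice of n = 3, the frequency normalization, and the sample traces are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

NGram = Tuple[int, ...]


def extract_ngrams(trace: Sequence[int], n: int) -> Counter:
    """Count every contiguous window of n system calls in one trace."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


def build_vocabulary(traces: Sequence[Sequence[int]], n: int) -> Dict[NGram, int]:
    """Assign a column index to each distinct n-gram seen in the training traces."""
    vocab: Dict[NGram, int] = {}
    for trace in traces:
        for gram in extract_ngrams(trace, n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def vectorize(trace: Sequence[int], vocab: Dict[NGram, int], n: int) -> List[float]:
    """Map a trace to a fixed-length vector of relative n-gram frequencies.

    N-grams that are not in the vocabulary get no column, but they still
    count toward the total used for normalization.
    """
    counts = extract_ngrams(trace, n)
    total = sum(counts.values())
    vector = [0.0] * len(vocab)
    if total == 0:  # trace shorter than n: no n-grams to count
        return vector
    for gram, count in counts.items():
        index = vocab.get(gram)
        if index is not None:
            vector[index] = count / total
    return vector


if __name__ == "__main__":
    # Made-up system call identifiers, for illustration only.
    sample_traces = [[6, 6, 63, 6, 42], [6, 63, 6, 42, 120]]
    vocabulary = build_vocabulary(sample_traces, n=3)
    for sample in sample_traces:
        print(vectorize(sample, vocabulary, n=3))
```

The resulting fixed-length vectors can be passed to any standard classifier. Larger values of n capture longer call sequences at the cost of a larger and sparser vocabulary.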
We applied three machine learning classifiers, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF), to both binary classification (’normal’ or ’attack’) and multi-class classification (’normal’ or one of six attack classes). We achieved better performance in binary classification than in multi-class classification. In the experiments, the accuracy of SVM, KNN, and RF was 96.26% with a 5.1% false positive rate (FPR), 96.71% with a 3.28% FPR, and 96.86% with a 3.9% FPR, respectively. For future work, we recommend using real-time data and other classifiers for better anomaly detection. Integrating other host-based IDS tools with the anomaly-based approach is also important for building a strong intrusion detection and prevention system.
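As a reference for how the reported metrics relate to classifier output, the sketch below computes accuracy, detection rate, and false positive rate for the binary case. It uses the standard confusion-matrix definitions, which are an assumption here since the abstract does not state its formulas. The function name `binary_metrics` and the sample labels are hypothetical and are not results from this work.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                   positive: str = "attack") -> Dict[str, float]:
    """Compute accuracy, detection rate, and FPR from true and predicted labels.

    Assumed definitions:
      accuracy       = (TP + TN) / (TP + TN + FP + FN)
      detection rate = TP / (TP + FN)
      FPR            = FP / (FP + TN)
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    truth = ["normal"] * 4 + ["attack"] * 4
    predicted = ["normal", "normal", "normal", "attack",
                 "attack", "attack", "attack", "normal"]
    print(binary_metrics(truth, predicted))
```

For the multi-class setting, the same counts can be computed per class by treating each attack class in turn as the positive label.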