A Novel Hybrid Approach for Network Intrusion Detection using Extreme Gradient Boosting and Long Short-Term Memory Networks

Rashmi Erandika Ratnayake *, Hakim Usoof **
* Department of Information and Communication Technology, University of Sri Jayewardenepura, Sri Lanka.
** Department of Statistics and Computer Science, University of Peradeniya, Sri Lanka.
Periodicity: December - February 2021
DOI: https://doi.org/10.26634/jcom.8.4.18338

Abstract

Network Intrusion Detection (NID) has become a prominent research topic as the use of technology and networks has grown. This study presents a novel hybrid approach to network intrusion detection that combines Extreme Gradient Boosting (XGBoost) with Long Short-Term Memory (LSTM) networks, evaluated on the benchmark NSL-KDD dataset. XGBoost was used for feature selection to create several minimal feature sets, and the effect of using each set in an LSTM model that classifies whether a record's network features belong to an attack was studied. The results show that XGBoost feature selection can produce minimal feature sets with very high feature reduction ratios for use in an LSTM model for NID. Such sets give a clear understanding of the features the model learns from, shorten training times, and require less storage, while achieving accuracy close to that obtained with all the features in the dataset. The findings can be used to build better real-time NID systems based on deep neural networks. They can also be used to develop a first layer of defense that alerts users to possible threats in real time.
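
The feature-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The importance scores are hypothetical placeholders rather than results from the study, the threshold value and helper names are invented for illustration, and the definition of the feature reduction ratio (the fraction of original features dropped) is an assumption, since the abstract does not define it. In practice the scores would come from a trained XGBoost model, for example the `feature_importances_` attribute of its scikit-learn style classifier.

```python
from typing import Dict, List


def select_features(importances: Dict[str, float], threshold: float) -> List[str]:
    """Keep features whose importance is at least `threshold`,
    ordered from most to least important."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


def feature_reduction_ratio(n_total: int, n_selected: int) -> float:
    """Fraction of the original features that were dropped.
    Assumed definition; the abstract does not state one."""
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    return (n_total - n_selected) / n_total


# Hypothetical importance scores for a handful of NSL-KDD feature names.
# These numbers are illustrative only, not values reported in the study.
toy_importances = {
    "src_bytes": 0.40,
    "dst_bytes": 0.25,
    "service": 0.20,
    "flag": 0.10,
    "duration": 0.05,
}

# Hypothetical threshold; several thresholds would yield several feature sets.
selected = select_features(toy_importances, threshold=0.20)
ratio = feature_reduction_ratio(len(toy_importances), len(selected))

print("selected features:", selected)
print("feature reduction ratio:", ratio)
```

Repeating the selection with different thresholds would give a family of progressively smaller feature sets, each of which could then be used to train a separate LSTM classifier and compared on accuracy and training time.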

Keywords

Deep Learning, Extreme Gradient Boosting, Network Intrusion Detection, Long Short-Term Memory Networks

How to Cite this Article?

Ratnayake, R. E., and Usoof, H. (2021). A Novel Hybrid Approach for Network Intrusion Detection using Extreme Gradient Boosting and Long Short-Term Memory Networks. i-manager's Journal on Computer Science, 8(4), 7-13. https://doi.org/10.26634/jcom.8.4.18338
