Evaluation of Classification Algorithms for Phishing URL Detection

Ayanfeoluwa Oluwasola Oluyomi*, Oluwafemi Osho**, Maryam Shuaib***
* President, Information System Audit & Control Association (ISACA), Federal University of Technology, Minna, Nigeria.
** Lecturer, Department of Cyber Security Science, Federal University of Technology Minna, Nigeria.
*** Former Special Assistant, ICT Development to the Governor of Nigeria State, Nigeria.
Periodicity: September - November 2018
DOI: https://doi.org/10.26634/jcom.6.3.15698

Abstract

A phishing URL is a web address created to deceive users into disclosing personal and private data, or into downloading malware onto their systems without their knowledge. Growth in Internet adoption has led to a corresponding increase in the number of phishing sites globally. Many classification techniques have been developed for detecting phishing URLs. This paper evaluates the performance of existing techniques. Using a dataset obtained from the UCI Machine Learning Repository, the algorithms were assessed in terms of Accuracy, Precision, Recall, F-Measure, Receiver Operating Characteristic (ROC) area, and Root Mean Squared Error (RMSE). Based on this analysis and a comparison with results from related literature, Random Forest was found to perform best.
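
For readers unfamiliar with the six measures named above, the following is a minimal sketch of how they can be computed for a binary classifier. It is not the authors' evaluation procedure. The function name `evaluate`, the label encoding (1 = phishing, 0 = legitimate), the 0.5 decision threshold, and the toy scores are all assumptions made for illustration. Precision, Recall and F-Measure are reported here for the positive class only, and RMSE is taken over the predicted probability of the positive class; a given toolkit may average these differently.

```python
import math


def evaluate(y_true, y_score, threshold=0.5):
    """Compute Accuracy, Precision, Recall, F-Measure, ROC area and RMSE.

    y_true  : list of 0/1 labels (assumed encoding: 1 = phishing).
    y_score : list of predicted probabilities for class 1.
    """
    # Turn probabilities into hard predictions at the chosen threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # ROC area as the probability that a randomly chosen positive
    # receives a higher score than a randomly chosen negative
    # (ties count as one half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(
            1.0 if p > n else 0.5 if p == n else 0.0
            for p in pos
            for n in neg
        )
        roc_area = wins / (len(pos) * len(neg))
    else:
        roc_area = float("nan")

    # RMSE between the true label and the predicted probability.
    rmse = math.sqrt(
        sum((t - s) ** 2 for t, s in zip(y_true, y_score)) / len(y_true)
    )

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "roc_area": roc_area,
        "rmse": rmse,
    }


if __name__ == "__main__":
    # Toy labels and scores, invented purely for illustration.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
    for name, value in evaluate(labels, scores).items():
        print(f"{name}: {value:.4f}")
```

The first five measures are better when higher, whereas a lower RMSE is better, so comparing classifiers requires reading RMSE in the opposite direction from the rest.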

Keywords

Internet, WWW, phishing, phishing URL, Classification Algorithms, Random Forest

How to Cite this Article?

Oluyomi, A., Osho, O., & Shuaib, M. (2018). Evaluation of Classification Algorithms for Phishing URL Detection. i-manager's Journal on Computer Science, 6(3), 34-41. https://doi.org/10.26634/jcom.6.3.15698
