Precise Detection of Phishing URLS Using Recurrent Neural Networks

Kamireddy Neeharika*, K. P. Ruphaa Sri **, Vishruthi B. ***, M. Suresh Anand ****
*-**** Department of Computer Science and Engineering, Sri Sairam Engineering College, Chennai, Tamil Nadu, India.
Periodicity: March - May 2021
DOI : https://doi.org/10.26634/jcom.9.1.18154

Abstract

Phishing attacks can now be launched from anywhere in the world at negligible cost by people with little to no technical skill. As the skills and costs required to deploy phishing attacks decline, scams have reached unprecedented levels, driving the need for more effective methods of proactively detecting phishing threats. This work explores the use of URLs as input to machine learning models for phishing site prediction. A feature engineering approach followed by a random forest classifier is compared against a novel method based on recurrent neural networks. The recurrent neural network approach is found to provide accurate detection even without manual feature creation.
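
The two input representations contrasted in the abstract can be illustrated with a minimal sketch. The abstract does not list the features or the character encoding actually used, so everything below is an assumption made for illustration: the function names `lexical_features` and `encode_chars`, the chosen features, the printable-ASCII vocabulary, and the maximum length of 150 are all hypothetical. The first function produces hand-crafted features of the kind a random forest classifier could consume; the second turns the raw URL into a fixed-length integer sequence of the kind a recurrent neural network could read directly, with no manual feature design.

```python
import string
from urllib.parse import urlparse

# Hypothetical vocabulary: printable ASCII characters mapped to 1..100.
# Index 0 is reserved for padding; UNK marks characters outside the vocabulary.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable)}
UNK = len(VOCAB) + 1


def lexical_features(url: str) -> dict:
    """Hand-crafted URL features (illustrative choices, not the paper's list)."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(c.isdigit() for c in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
    }


def encode_chars(url: str, max_len: int = 150) -> list:
    """Integer-encode a URL character by character.

    The URL is truncated to its first max_len characters and left-padded
    with zeros so that every sequence has the same length.
    """
    ids = [VOCAB.get(ch, UNK) for ch in url[:max_len]]
    return [0] * (max_len - len(ids)) + ids


if __name__ == "__main__":
    sample = "http://paypal.com.secure-login.example.net/update?id=123"
    print(lexical_features(sample))
    print(encode_chars(sample)[-10:])
```

In a full pipeline, the feature dictionaries would be stacked into a table for a tree-based classifier, while the integer sequences would typically pass through an embedding layer before a recurrent layer. Both steps are outside the scope of this sketch.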

Keywords

Cyber Security, Phishing, Machine Learning, Website Classification.

How to Cite this Article?

Neeharika, K., Sri, K. P. R., Vishruthi, B., and Anand, M. S. (2021). Precise Detection of Phishing URLS Using Recurrent Neural Networks. i-manager's Journal on Computer Science, 9(1), 21-26. https://doi.org/10.26634/jcom.9.1.18154
