Profiling Inappropriate Users’ Tweets Using Deep Long Short-Term Memory (LSTM) Neural Network

Abubakar Umar*, Sulaimon A. Bashir**, Laud Charles Ochei***, Ibrahim A. Adeyanju****
*-** Department of Computer Science, Federal University of Technology, Minna, Nigeria.
*** Robert Gordon University, Aberdeen, UK.
**** Department of Computer Engineering, Federal University, Oye-Ekiti, Nigeria.
Periodicity: December - February 2019
DOI : https://doi.org/10.26634/jpr.5.4.15864

Abstract

In recent times, large Internet companies have come under increased pressure from governments and NGOs to remove inappropriate material from social media platforms (e.g., Twitter, Facebook, YouTube). A typical example of this problem is the posting of hateful, abusive, and violent tweets on Twitter, which has been blamed for inciting hatred and violence and for causing societal disturbances. Manually identifying such tweets and the people who post them is very difficult because of the large number of active users and the frequency with which such tweets are posted. Existing approaches have focused on detecting inappropriate tweets without identifying the users who post them. This paper proposes an approach that automatically identifies different types of inappropriate tweets together with the users who post them. The approach is based on a user profiling algorithm that uses a deep Long Short-Term Memory (LSTM) neural network trained to detect abusive language. Using word embedding features learned from the training set, the model classifies each user's tweets into abusive language categories. The user profiling algorithm then uses the classes assigned to each user's tweets to place that user in an abusive language category. Experiments on the test set show that the deep LSTM-based abusive language detection model reached an accuracy of 89.14% in classifying a tweet as bigoted, offensive, racist, extremism-related, or neutral. The user profiling algorithm obtained an accuracy of 83.33% in predicting whether a user is a bigot, a racist, an extremist, a user of offensive language, or neutral.
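The abstract does not spell out how per-tweet classes are combined into a user profile. As a minimal sketch, assuming a simple majority vote over each user's predicted tweet classes (an assumption for illustration, not necessarily the rule used in the paper), the profiling step could look like the following. The function name `profile_users` and the `(user, label)` input format are hypothetical.

```python
from collections import Counter, defaultdict


def profile_users(labelled_tweets):
    """Assign each user the most frequent class among their tweets.

    labelled_tweets: iterable of (user_id, predicted_label) pairs, where
    predicted_label is the class a tweet classifier (e.g. an LSTM model)
    assigned to one tweet. Majority voting is an assumed aggregation rule.
    """
    # Count how often each class was predicted for each user.
    per_user = defaultdict(Counter)
    for user_id, label in labelled_tweets:
        per_user[user_id][label] += 1

    # The profile is the class with the highest count for that user.
    return {
        user_id: counts.most_common(1)[0][0]
        for user_id, counts in per_user.items()
    }


# Hypothetical per-tweet predictions for two users.
predictions = [
    ("user_a", "racist"), ("user_a", "racist"), ("user_a", "neutral"),
    ("user_b", "neutral"), ("user_b", "neutral"), ("user_b", "offensive"),
]
print(profile_users(predictions))
```

A majority vote is only one possible choice; a threshold on the share of non-neutral tweets would be an equally plausible alternative and would flag users whose abusive tweets are a minority of their output.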

Keywords

Twitter, Abusive Language, Tweet Classification, User Profiling Algorithm, Feature Representation, Machine Learning, Deep Learning.

How to Cite this Article?

Umar, A., Bashir, S. A., Ochei, L. C., & Adeyanju, I. A. (2019). Profiling Inappropriate Users' Tweets Using Deep Long Short-Term Memory (LSTM) Neural Network. i-manager's Journal on Pattern Recognition, 5(4), 27-43. https://doi.org/10.26634/jpr.5.4.15864
