Development of a Model for Detecting Emotions using CNN and LSTM

Manish Goswami*, Aditya Parate**, Nisarga Kapde***, Shashwat Singh****, Nitiksha Gupta*****, Meena Surjuse******
*-****** Computer Science and Engineering, S. B. Jain Institute of Technology, Management and Research, Nagpur, India.
Periodicity: July–September 2024

Abstract

This paper presents the development of a real-time deep learning system for emotion recognition from both speech and facial inputs. For speech emotion recognition, three datasets were used: SAVEE, the Toronto Emotional Speech Set (TESS), and CREMA-D, together comprising over 75,000 samples spanning eight emotions (Anger, Sadness, Fear, Disgust, Calm, Happiness, Neutral, and Surprise) mapped to numerical labels 1 to 8. The system identifies emotions from live speech and pre-recorded audio files using a Long Short-Term Memory (LSTM) network, which is well suited to sequential data; the LSTM model, trained on the RAVDESS dataset (7,356 audio files), achieved a training accuracy of 83%. For facial emotion recognition, a Convolutional Neural Network (CNN) was employed on datasets including FER2013, CK+, AffectNet, and JAFFE. FER2013 contains over 35,000 labeled images covering seven key emotions, while CK+ provides 593 video sequences for precise emotion classification. By combining the LSTM for speech with the CNN for facial expressions, the system robustly identifies and classifies emotions across both modalities, enabling comprehensive real-time emotion recognition.
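The abstract does not specify the exact preprocessing pipeline, but the input layouts it implies are standard: speech is framed into a (timesteps, features) sequence for the LSTM, and face crops are shaped as (height, width, channels) arrays for the CNN (FER2013 images are 48×48 grayscale). The following NumPy sketch illustrates those layouts; the function name, frame length, and hop size are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Emotion labels as described in the abstract, mapped to 1..8.
EMOTIONS = {
    1: "Anger", 2: "Sadness", 3: "Fear", 4: "Disgust",
    5: "Calm", 6: "Happiness", 7: "Neutral", 8: "Surprise",
}

def frame_audio(signal, frame_len=400, hop=160):
    """Slice a 1-D audio signal into overlapping frames,
    giving the (timesteps, features) layout an LSTM consumes.
    frame_len and hop are hypothetical values (25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len]
                     for i in range(n_frames)])

# One second of 16 kHz audio -> a sequence of 98 frames of 400 samples.
audio = np.zeros(16000)
seq = frame_audio(audio)          # shape (98, 400)

# A 48x48 grayscale face crop (FER2013's image size), with a channel
# axis appended to give the (H, W, channels) layout a CNN expects.
face = np.zeros((48, 48))
img = face[..., np.newaxis]       # shape (48, 48, 1)
```

In practice the per-frame features would be spectral descriptors such as MFCCs rather than raw samples, but the sequence-versus-image distinction shown here is what motivates pairing an LSTM with a CNN.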

Keywords

Emotion Recognition, Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), RAVDESS, CREMA-D, Toronto Emotional Speech Set (TESS), SAVEE, Extreme Learning Machine (ELM), Support Vector Machine (SVM).

How to Cite this Article?

Goswami, M., Parate, A., Kapde, N., Singh, S., Gupta, N., and Surjuse, M. (2024). Development of a Model for Detecting Emotions using CNN and LSTM. i-manager’s Journal on Software Engineering, 19(1), 17-28.

