A Comparative Analysis of Neural Network Function: Resilient Back Propagation Algorithm (BPA) and Radial Basis Functions (RBF) in Multilingual Environment

Vinay Kumar Jain*
Shri Shankaracharya Technical Campus, Bhilai, Chhattisgarh, India.
Periodicity: January - June 2022
DOI: https://doi.org/10.26634/jdp.10.1.18639

Abstract

Artificial Neural Networks (ANNs) are among the most convenient tools for speech processing, and their effectiveness has been tested in various real-time applications. A classifier based on artificial neural networks identifies utterances from features extracted from the speech signal. The proposed approach to multilingual speaker identification consists of two parts: training and testing. In the training part, the classifier is trained using speech feature vectors. Spoken language carries complete information, including details about the content of the message and details about its speaker. In the present work, speech signal databases of different speakers in a multilingual environment were recorded in three Indian languages: Hindi, Marathi, and Rajasthani. Two sets of cepstral features were extracted from the speech signal: Mel-Frequency Cepstral Coefficients (MFCC) and Gammatone Frequency Cepstral Coefficients (GFCC). The system is designed for speaker recognition from multilingual speech signals using MFCC, GFCC, and their combination as acoustic features. Training and testing were performed using two neural network (NN) functions, the resilient Back Propagation Algorithm (BPA) and Radial Basis Functions (RBF), and the results were compared. The speaker identification system achieves an accuracy of 94.89% using BPA and 96.62% using the RBF neural network.
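
The training and testing parts described above can be illustrated with a minimal sketch of an RBF-style classifier. This is not the article's implementation, and several choices are assumptions made only for illustration: the toy two-dimensional vectors stand in for MFCC/GFCC feature vectors, each hidden unit is centered on the mean of one speaker's training vectors, the Gaussian width is fixed at 1.0, and the output weights are fitted with a simple least-mean-squares (delta rule) update. The names `train_rbf`, `predict`, and `accuracy` are hypothetical.

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian radial basis function: exp(-||x - c||^2 / (2 * width^2))."""
    sq_dist = sum((a - c) ** 2 for a, c in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def class_means(features, labels):
    """Assumed center selection: one center per speaker (mean training vector)."""
    grouped = {}
    for vec, lab in zip(features, labels):
        grouped.setdefault(lab, []).append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in grouped.items()}

def hidden_layer(x, model):
    """RBF activations for every center, plus a constant bias input."""
    acts = [gaussian_rbf(x, model["centers"][s], model["width"])
            for s in model["speakers"]]
    return acts + [1.0]

def train_rbf(features, labels, width=1.0, epochs=100, lr=0.1):
    """Training part: fix the centers, then fit linear output weights (delta rule)."""
    centers = class_means(features, labels)
    speakers = sorted(centers)
    model = {"centers": centers, "speakers": speakers, "width": width,
             "weights": [[0.0] * (len(speakers) + 1) for _ in speakers]}
    for _ in range(epochs):
        for vec, lab in zip(features, labels):
            hidden = hidden_layer(vec, model)
            for k, speaker in enumerate(speakers):
                target = 1.0 if speaker == lab else 0.0
                out = sum(w * h for w, h in zip(model["weights"][k], hidden))
                err = target - out
                for j, h in enumerate(hidden):
                    model["weights"][k][j] += lr * err * h
    return model

def predict(model, x):
    """Return the speaker whose output unit has the highest score."""
    hidden = hidden_layer(x, model)
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in model["weights"]]
    return model["speakers"][scores.index(max(scores))]

def accuracy(model, features, labels):
    """Testing part: percentage of test vectors assigned to the correct speaker."""
    correct = sum(predict(model, x) == y for x, y in zip(features, labels))
    return 100.0 * correct / len(labels)

# Toy data (illustrative only): two speakers, well-separated 2-D "feature vectors".
train_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
train_y = ["s1", "s1", "s1", "s2", "s2", "s2"]
test_x = [[0.1, 0.0], [3.1, 3.0]]
test_y = ["s1", "s2"]

model = train_rbf(train_x, train_y)
test_accuracy = accuracy(model, test_x, test_y)
```

In a real system the feature vectors would have one dimension per cepstral coefficient, and the number of centers and the width would be tuned on held-out data. The accuracy figure here only demonstrates how an identification rate such as those reported above would be computed on a test set.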

Keywords

Language, MFCC, GFCC, BPA, RBF.

How to Cite this Article?

Jain, V. K. (2022). A Comparative Analysis of Neural Network Function: Resilient Back Propagation Algorithm (BPA) and Radial Basis Functions (RBF) in Multilingual Environment. i-manager’s Journal on Digital Signal Processing, 10(1), 9-16. https://doi.org/10.26634/jdp.10.1.18639
