An Analysis of Speech Signal for Multiple Languages through Spectrum Analysis

Pooja Yadav*, Vinay Kumar Jain**
* PG Scholar, Electronics and Telecommunication Engineering, SSTC, Bhilai, India.
** Research Scholar, Swami Vivekananda Technical University, Bhilai, India.
Periodicity: December - February 2017
DOI : https://doi.org/10.26634/jpr.3.4.13540

Abstract

Human listeners outperform automatic speech recognition systems in every speech recognition task. Modern high-tech automatic speech recognition systems perform very well in environments where the speech signals are reasonably clean. In most cases, however, recognition by machines degrades dramatically with slight changes in the speech signal or the speaking environment, so complex algorithms are needed to characterize this variability and allow the speech to be identified reliably. Speech technology offers many possibilities for personal identification that are practical and non-intrusive. It also offers the ability to verify a person's identity remotely, over a long distance, using an ordinary telephone. In this paper, the authors propose a technique to recognize words or speech through spectrogram analysis. The technique is used to study the principles of speaker recognition in multiple languages, to examine its uses in identification and verification systems, and to evaluate the recognition capability of various voice features and parameters. The aim is to determine the approach that is most suitable for Automatic Speaker Recognition systems in terms of reliability and computational efficiency.
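The abstract names spectrogram analysis as the core of the proposed recognition technique. Since the full paper is not reproduced here, the following is only a minimal illustrative sketch of that front-end step, not the authors' implementation: it computes a log-power spectrogram of an utterance with SciPy. The file name "utterance.wav" and the 25 ms window / 10 ms hop framing are assumptions, chosen as common defaults in speech front ends.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram

# Load a mono recording; wavfile.read returns (sample_rate, samples).
# "utterance.wav" is an assumed placeholder file, not from the paper.
fs, signal = wavfile.read("utterance.wav")
signal = signal.astype(np.float64)
peak = np.max(np.abs(signal))
if peak > 0:
    signal /= peak  # normalize amplitude to [-1, 1]

# Short-time Fourier analysis: 25 ms Hamming windows with a 10 ms hop
# (assumed framing, typical for speech analysis).
nperseg = int(0.025 * fs)
noverlap = nperseg - int(0.010 * fs)
freqs, times, Sxx = spectrogram(signal, fs=fs, window="hamming",
                                nperseg=nperseg, noverlap=noverlap)

# Log-power spectrogram; the small floor avoids log(0) in silent frames.
log_spec = 10.0 * np.log10(Sxx + 1e-10)
print(log_spec.shape)  # (frequency bins, time frames)
```

Frame-level features such as the MFCC and LPCC parameters listed in the Keywords would then be derived from per-frame spectra like these.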

Keywords

Speech Recognition, Computational Efficiency, Speaker Recognition, MFCC, LPCC

How to Cite this Article?

Yadav, P., and Jain, V. K. (2017). An Analysis of Speech Signal for Multiple Languages through Spectrum Analysis. i-manager’s Journal on Pattern Recognition, 3(4), 22-27. https://doi.org/10.26634/jpr.3.4.13540
