i-manager Publications

Speaker Identification Using K-means Method Based on Mel Frequency Cepstral Coefficients (MFCC)

Dirman Hanafi*, Abdul Syafiq Abdul Sukor**

*-** Department of Mechatronic and Robotic Engineering, Faculty of Electrical and Electronic Engineering, University Tun Hussein Onn Malaysia.

Periodicity: February - April 2012
DOI: https://doi.org/10.26634/jes.1.1.1729

Abstract

The most common method people use to protect secured data or information is password or PIN/ID protection. This method requires users to authenticate themselves by entering a password that they have already created. However, the data is not secure enough: fraud and theft occur when other people can easily learn the password. A newer technology, known as the Biometric Identification System, uses biometric characteristics that are unique to an individual and can therefore be used to authenticate a user's access authority. This paper focuses on an implementation of speech recognition as a security access control medium for restricted services such as phone banking systems, voicemail, or access to database services. First, the speaker signal goes through a pre-treatment process that removes background noise. Next, features are extracted from the speech signal using the Mel Frequency Cepstral Coefficients (MFCC) method. Then, using Vector Quantization, the features are matched against the reference speech in the database. The speaker is identified by clustering the speech signal of the tested speaker against the codebook of each speaker using the K-means algorithm, and the speaker with the lowest distortion, measured by Euclidean distance, is chosen as the correct speaker. The main focus of this research is speaker identification, which compares the speech signal of an unknown speaker with a database of known speakers using text-dependent utterances. The experimental results show that the developed method recognizes the correct voice source perfectly.
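The matching step described in the abstract (one K-means codebook per known speaker, then choosing the speaker whose codebook gives the lowest Euclidean distortion) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `train_codebook`, `distortion`, and `identify`, the codebook size of 4, the 12-dimensional feature vectors, and the synthetic Gaussian data standing in for real MFCC frames are all assumptions made for illustration.

```python
import numpy as np


def train_codebook(features, k=4, iters=20, seed=0):
    """Build a k-entry VQ codebook from feature vectors with plain K-means."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct training vectors.
    picks = rng.choice(len(features), size=k, replace=False)
    codebook = features[picks].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every vector to every centroid.
        diff = features[:, None, :] - codebook[None, :, :]
        nearest = (diff ** 2).sum(axis=2).argmin(axis=1)
        for j in range(k):
            members = features[nearest == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                codebook[j] = members.mean(axis=0)
    return codebook


def distortion(features, codebook):
    """Mean Euclidean distance from each vector to its nearest codeword."""
    diff = features[:, None, :] - codebook[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist.min(axis=1).mean()


def identify(features, codebooks):
    """Return the name of the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))


# Assumption: synthetic vectors stand in for MFCC frames of two enrolled speakers.
rng = np.random.default_rng(1)
training = {
    "speaker_a": rng.normal(0.0, 1.0, size=(200, 12)),
    "speaker_b": rng.normal(5.0, 1.0, size=(200, 12)),
}
codebooks = {name: train_codebook(frames) for name, frames in training.items()}

# Frames from an "unknown" speaker, drawn from speaker_b's distribution.
test_frames = rng.normal(5.0, 1.0, size=(50, 12))
result = identify(test_frames, codebooks)
print(result)
```

In a real system the synthetic arrays would be replaced by MFCC frames extracted from noise-reduced recordings, and the codebook size would be tuned to the amount of enrollment speech available.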

Keywords

Security, biometric characteristic, speech recognition system, MFCC, Vector Quantization, K-means algorithm, Euclidean distances.

How to Cite this Article?

Hanafi, D., and Sukor, A. S. A. (2012). Speaker Identification Using K-Means Method Based On Mel Frequency Cepstral Coefficients (MFCC). i-manager’s Journal on Embedded Systems, 1(1), 19-28. https://doi.org/10.26634/jes.1.1.1729

