Modified PSOLA-Genetic Algorithm based approach for Voice Re-Construction

Partha Sarathy Banerjee*, **
*Assistant Professor, Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna (MP), India.
** Assistant Professor, Department of Information Technology, Jadavpur University, Salt Lake City, Sector III, Kolkata (WB), India.
Periodicity: September - November 2013
DOI: https://doi.org/10.26634/jit.2.4.2538

Abstract

Synthetic voice generation, also called artificial voice or voice conversion, is the process of reconstructing or regenerating a voice sample from a source sample, or of modifying a source voice into a desired target voice. Conventional solutions train and apply conversion functions, which generally require a suitable amount of pre-stored training data from both the source and the target speaker. This paper addresses the crucial problem of achieving the required prosody, timbre and other unique voice templates while considerably reducing the dependence on a training dataset of voice samples. We needed a way to obtain templates of the target voice that are parametrically nearly the same. This is achieved by assigning a marker to the target voice sample used for training. A proper estimation of the transformation function is possible only with this data, and the process itself can be carried out with existing methods. In a nutshell, we propose a system that reaches considerable closeness to the target voice even when training data is scarce. One disadvantage is that higher precision and closer resemblance require a clear understanding of the spelling system that the language uses.

Keywords

Artificial Voice, Prosody, Timbre, Source Voice, Target Voice, Formant Structure

How to Cite this Article?

Banerjee, P. S., and Roy, U. K., (2013). Modified PSOLA-Genetic Algorithm Based Approach For Voice Re-Construction. i-manager’s Journal on Information Technology, 2(4), 1-9. https://doi.org/10.26634/jit.2.4.2538
