References
[1]. Hu, Y., & Loizou, P. C. (2006, May). Subjective comparison of speech enhancement algorithms. In 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings (Vol. 1). IEEE. https://doi.org/10.1109/ICASSP.2006.1659980
[2]. Karhila, R., Remes, U., & Kurimo, M. (2013). Noise in
HMM-based speech synthesis adaptation: Analysis,
evaluation methods and experiments. IEEE Journal of
Selected Topics in Signal Processing, 8(2), 285-295.
https://doi.org/10.1109/JSTSP.2013.2278492
[3]. Kinoshita, K., Delcroix, M., Ogawa, A., & Nakatani, T.
(2015). Text-informed speech enhancement with deep
neural networks. In Sixteenth Annual Conference of the
International Speech Communication Association.
[4]. Stan, A., Watts, O., Mamiya, Y., Giurgiu, M., Clark, R. A.,
Yamagishi, J., & King, S. (2013, August). TUNDRA: A
multilingual corpus of found data for TTS research created
with light supervision. In INTERSPEECH (pp. 2331-2335).
[5]. Toda, T., & Tokuda, K. (2007). A speech parameter
generation algorithm considering global variance for
HMM-based speech synthesis. IEICE Transactions on
Information and Systems, 90(5), 816-824.
[6]. Wang, Y., & Wang, D. (2015, April). A deep neural
network for time-domain signal reconstruction. In 2015 IEEE
International Conference on Acoustics, Speech and
Signal Processing (ICASSP) (pp. 4390-4394). IEEE. https://doi.org/10.1109/ICASSP.2015.7178800
[7]. Weninger, F., Erdogan, H., Watanabe, S., Vincent, E., Le Roux, J., Hershey, J. R., & Schuller, B. (2015). Speech enhancement with LSTM recurrent neural networks and its application to noise-robust ASR. In Latent Variable Analysis and Signal Separation (pp. 91-99). Springer International Publishing.
[8]. Weninger, F., Hershey, J. R., Le Roux, J., & Schuller, B.
(2014, December). Discriminatively trained recurrent
neural networks for single-channel speech separation. In
IEEE Global Conference on Signal and Information
Processing (GlobalSIP) (pp. 577-581). IEEE. https://doi.org/10.1109/GlobalSIP.2014.7032183
[9]. Xu, Y., Du, J., Dai, L. R., & Lee, C. H. (2015). A regression
approach to speech enhancement based on deep
neural networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(1), 7-19. https://doi.org/10.1109/TASLP.2014.2364452
[10]. Yamagishi, J., Veaux, C., King, S., & Renals, S. (2012). Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction. Acoustical Science and Technology, 33(1), 1-5. https://doi.org/10.1250/ast.33.1
[11]. Alasadi, A. A., Aldhyani, T. H., Deshmukh, R. R., Alahmadi, A. H., & Alshebami, A. S. (2020). Efficient feature extraction algorithms to develop an Arabic speech recognition system. Engineering, Technology & Applied Science Research, 10(2), 5547-5553. https://doi.org/10.48084/etasr.3465
[12]. Rabiner, L. (2012). Digital Speech Processing - Lecture 4: Speech Perception - Auditory Models, Sound Perception Models, MOS Methods [Presentation]. Department of Electrical and Computer Engineering, University of California, Santa Barbara, USA. Retrieved from https://web.ece.ucsb.edu/Faculty/Rabiner/ece259/digital%20speech%20processing%20course/lectures_new/Lecture%204_winter_2012.pdf
[13]. Rani, P. M. K., Kumar, D. V., Gubbi, A., & Dattatraya. (2018). Speaker recognition technique for web browser using MFCC algorithm and RGB colour detection for mouse cursor movement. International Journal of Engineering Research & Technology, 6(13), 1-6.
[14]. Vaseghi, S. V. (2007). Multimedia signal processing:
Theory and applications in speech, music and
communications. John Wiley & Sons.
[15]. Naylor, P. A., Gaubitch, N. D., & Habets, E. A. (2010).
Signal-based performance evaluation of dereverberation
algorithms. Journal of Electrical and Computer
Engineering, 2010. https://doi.org/10.1155/2010/127513
[16]. Abeßer, J. (2020). A review of deep learning based
methods for acoustic scene classification. Applied
Sciences, 10(6). https://doi.org/10.3390/app10062020
[17]. Jiang, W., Liu, P., & Wen, F. (2018). Speech magnitude
spectrum reconstruction from MFCCs using deep neural
network. Chinese Journal of Electronics, 27(2), 393-398.
https://doi.org/10.1049/cje.2017.09.018
[18]. NPTEL. (n.d.). Short-Time Fourier transform (STFT) [Presentation]. National Programme on Technology Enhanced Learning. Retrieved from https://nptel.ac.in/content/storage2/courses/117105145/pdf/Week_5_Lecture_Material.pdf
[19]. Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai,
B., ... & Chen, T. (2018). Recent advances in convolutional
neural networks. Pattern Recognition, 77, 354-377.
[20]. Gopika, P., Krishnendu, C. S., Chandana, M. H.,
Ananthakrishnan, S., Sowmya, V., Gopalakrishnan, E. A., &
Soman, K. P. (2020). Single-layer convolution neural network
for cardiac disease classification using electrocardiogram
signals. In Deep Learning for Data Analytics (pp. 21-35).
Academic Press.