References
[1]. Abdel-Hamid, O., Deng, L., Yu, D., & Jiang, H. (2013).
Deep segmental neural networks for speech recognition.
In Interspeech, 36(70), 1849-1853.
[2]. Alom, M. Z., Taha, T. M., Yakopcic, C., Westberg, S.,
Sidike, P., Nasrin, M. S., ... & Asari, V. K. (2019). A state-of-the-art
survey on deep learning theory and architectures.
Electronics, 8(3), 292. https://doi.org/10.3390/electronics8030292
[3]. Bengio, Y. (2009). Learning deep architectures for AI.
Foundations and Trends in Machine Learning, 2(1), 1-127.
http://dx.doi.org/10.1561/2200000006
[4]. Bengio, Y., Courville, A., & Vincent, P. (2013).
Representation learning: A review and new perspectives.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 35(8), 1798-1828. https://doi.org/10.1109/TPAMI.2013.50
[5]. Canziani, A., Paszke, A., & Culurciello, E. (2016). An
analysis of deep neural network models for practical
applications. arXiv preprint arXiv:1605.07678.
https://doi.org/10.48550/arXiv.1605.07678
[6]. Deng, L., Abdel-Hamid, O., & Yu, D. (2013). A deep
convolutional neural network using heterogeneous
pooling for trading acoustic invariance with phonetic
confusion. In 2013 IEEE International Conference on
Acoustics, Speech and Signal Processing, 6669-6673.
https://doi.org/10.1109/ICASSP.2013.6638952
[7]. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep
residual learning for image recognition. In Proceedings of
the IEEE Conference on Computer Vision and Pattern
Recognition, 770-778.
[8]. He, Y., & Fosler-Lussier, E. (2012). Efficient segmental
conditional random fields for phone recognition. In
Proceedings of the Annual Conference of the
International Speech Communication Association,
1898-1901.
[9]. Karpathy, A., & Fei-Fei, L. (2015). Deep visual-semantic
alignments for generating image descriptions.
In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, 3128-3137.
[10]. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012).
ImageNet classification with deep convolutional neural
networks. In Proceedings of the 25th International
Conference on Neural Information Processing Systems,
1, 1097-1105.
[11]. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep
learning. Nature, 521, 436-444. https://doi.org/10.1038/nature14539
[12]. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A.,
Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013).
Playing Atari with deep reinforcement learning. arXiv
preprint arXiv:1312.5602. https://doi.org/10.48550/arXiv.1312.5602
[13]. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A.,
Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015).
Human-level control through deep reinforcement
learning. Nature, 518(7540), 529-533. https://doi.org/10.1038/nature14236
[14]. Schmidhuber, J. (2015). Deep learning in neural
networks: An overview. Neural Networks, 61, 85-117.
https://doi.org/10.1016/j.neunet.2014.09.003
[15]. Simonyan, K., & Zisserman, A. (2014). Very deep
convolutional networks for large-scale image
recognition. arXiv preprint arXiv:1409.1556. https://doi.org/10.48550/arXiv.1409.1556
[16]. Song, W., & Cai, J. (2015). End-to-end deep neural
network for automatic speech recognition. Stanford
CS224D Reports, 1-8.
[17]. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S.,
Anguelov, D., ... & Rabinovich, A. (2015). Going deeper
with convolutions. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, 1-9.
[18]. Tang, H., Wang, W., Gimpel, K., & Livescu, K. (2015).
Discriminative segmental cascades for feature-rich
phone recognition. In 2015 IEEE Workshop on Automatic
Speech Recognition and Understanding (ASRU), 561-568. https://doi.org/10.1109/ASRU.2015.7404845
[19]. Zeiler, M. D., & Fergus, R. (2013). Visualizing and
understanding convolutional networks. arXiv preprint
arXiv:1311.2901. https://doi.org/10.48550/arXiv.1311.2901
[20]. Zhang, L., Yang, F., Zhang, Y. D., & Zhu, Y. J. (2016).
Road crack detection using deep convolutional neural
network. In 2016 IEEE International Conference on Image
Processing (ICIP), 3708-3712. https://doi.org/10.1109/ICIP.2016.7533052
[21]. Zingade, A. (2018). Autonomous Driving using Deep
Learning and Behavioural Cloning. Retrieved from
https://medium.com/@akarshzingade/autonomous-driving-using-deep-learning-and-behavioural-cloning-97983a57fe10
[22]. Zweig, G. (2012). Classification and recognition with
direct segment models. In 2012 IEEE International
Conference on Acoustics, Speech and Signal Processing
(ICASSP), 4161-4164. https://doi.org/10.1109/ICASSP.2012.6288835