Pattern Recognition Approaches in Music Analytics

Makarand Velankar*, Parag Arun Kulkarni**
* Assistant Professor, Department of Information Technology, MKSSS's Cummins College of Engineering, and PhD Research Scholar, PICT, SPPU, Pune, Maharashtra, India.
** Founder, Chief Scientist and CEO, iknowlation Research Labs Pvt. Ltd., Pune, Maharashtra, India.
Periodicity: June - August 2018
DOI : https://doi.org/10.26634/jpr.5.2.14784

Abstract

Content-based Music Information Retrieval (MIR) has been a subject of study for the MIR research community since its inception. Different pattern recognition paradigms are applied to the diverse applications of content-based music information retrieval. Music is a multidimensional phenomenon that poses difficult analysis tasks. Tasks such as automatic music transcription, music recommendation, style identification, music classification, and emotion modeling require both quantitative and qualitative analysis. In spite of noteworthy efforts, the accuracy achieved on these tasks still falls short of what has been reached in neighboring fields. This paper covers feature learning techniques used for music data with conventional audio features drawn from different digital signal processing domains. Motivated by the remarkable improvements deep learning has brought to speech and image processing applications, similar efforts have been attempted in music data analytics. Deep learning applied to music analytics applications is covered, along with reported music adversaries. Future directions for conventional and deep learning approaches, together with evaluation criteria for pattern recognition approaches in music analytics, are explored.
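As an illustration of the conventional feature learning pipeline the survey refers to, the following is a minimal sketch of MFCC-based feature extraction feeding a standard classifier. The choice of librosa and scikit-learn, the helper mfcc_features, and the genre-classification framing are assumptions made for this example, not tools prescribed by the paper.

```python
# Minimal sketch: hand-crafted DSP features (MFCCs) + a standard classifier.
# librosa / scikit-learn are illustrative choices, not the paper's toolkit.
import librosa
import numpy as np
from sklearn.svm import SVC

def mfcc_features(path, n_mfcc=13):
    """Summarize a clip as the mean and variance of its MFCC trajectories."""
    y, sr = librosa.load(path, sr=22050)                     # decode to mono at 22.05 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape: (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])

# Hypothetical training data: file paths and genre labels (loader not shown).
# paths, labels = load_dataset(...)
# X = np.stack([mfcc_features(p) for p in paths])
# clf = SVC().fit(X, labels)
```

Aggregating frame-level MFCCs into clip-level statistics, as above, is one common way to bridge short-time DSP features and a pattern classifier; the surveyed deep learning approaches instead learn such representations directly from audio or spectrograms.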
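The abstract also mentions reported music adversaries. A minimal sketch of one common construction, the fast gradient sign method (FGSM), applied to a spectrogram classifier follows; PyTorch, the function fgsm_perturb, and the input format are illustrative assumptions, and the surveyed work may use different attacks.

```python
# Minimal FGSM sketch (one common adversarial construction; the surveyed
# papers may use other attacks). `model` is any PyTorch classifier over
# spectrogram tensors; all names here are illustrative.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, spec, label, eps=0.01):
    """Return a copy of `spec` perturbed to increase the model's loss.

    spec:  float tensor, e.g. (batch, channels, freq, time)
    label: long tensor of true class indices, shape (batch,)
    """
    spec = spec.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(spec), label)
    loss.backward()
    # Take a small step in the sign of the gradient of the loss w.r.t. input.
    return (spec + eps * spec.grad.sign()).detach()
```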

Keywords

Feature learning, pattern recognition, music analysis, machine learning, deep learning, content-based approach, music information retrieval.

How to Cite this Article?

Velankar, M., & Kulkarni, P. A. (2018). Pattern recognition approaches in music analytics. i-manager's Journal on Pattern Recognition, 5(2), 37-46. https://doi.org/10.26634/jpr.5.2.14784
