References
[1]. Agarap, A. F. (2018). Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375.
[6]. Canann, S. A., Tristano, J. R., & Staten, M. L. (1998). An approach to combined Laplacian and optimization-based smoothing for triangular, quadrilateral, and quad-dominant meshes. In Proceedings of the 7th International Meshing Roundtable (IMR) (pp. 479-494).
[11]. Jha, D., Smedsrud, P. H., Riegler, M. A., Halvorsen, P., De Lange, T., Johansen, D., & Johansen, H. D. (2020). Kvasir-SEG: A segmented polyp dataset. In MultiMedia Modeling: 26th International Conference, MMM 2020, Daejeon, South Korea, January 5–8, 2020, Proceedings, Part II 26 (pp. 451-462). Springer International Publishing.
[15]. Le, Q. V., Ngiam, J., Coates, A., Lahiri, A., Prochnow, B., & Ng, A. Y. (2011, June). On optimization methods for deep learning. In Proceedings of the 28th International Conference on International Conference on Machine Learning (pp. 265-272).
[16]. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
[17]. Li, S., Chen, H., Wang, M., Heidari, A. A., & Mirjalili, S. (2020). Slime mould algorithm: A new method for stochastic optimization. Future Generation Computer Systems, 111, 300-323.
[18]. Li, X., Xiong, H., Li, X., Wu, X., Zhang, X., Liu, J., & Dou, D. (2022). Interpretable deep learning: Interpretation, interpretability, trustworthiness, and beyond. Knowledge and Information Systems, 64(12), 3197-3234.
[19]. Lin, H., & Jegelka, S. (2018). Resnet with one-neuron hidden layers is a universal approximator. Advances in Neural Information Processing Systems, 31.
[22]. O'Shea, K., & Nash, R. (2015). An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458.
[28]. Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.
[29]. Smedsrud, P. H., Thambawita, V., Hicks, S. A., Gjestang, H., Nedrejord, O. O., Næss, E., & Halvorsen, P. (2021). Kvasir-Capsule, a video capsule endoscopy dataset. Scientific Data, 8(1), 142.
[30]. Soydaner, D. (2020). A comparison of optimization algorithms for deep learning. International Journal of Pattern Recognition and Artificial Intelligence, 34(13), 2052013.
[31]. Srivastava, S., Divekar, A. V., Anilkumar, C., Naik, I., Kulkarni, V., & Pattabiraman, V. (2021). Comparative analysis of deep learning image detection algorithms. Journal of Big Data, 8(1), 66.
[33]. Ul Rahman, J., Ali, A., Ur Rehman, M., & Kazmi, R. (2020a). A unit softmax with Laplacian smoothing stochastic gradient descent for deep convolutional neural networks. In Intelligent Technologies and Applications: Second International Conference, INTAP 2019, Bahawalpur, Pakistan, November 6–8, 2019, Revised Selected Papers (pp. 162-174). Springer Singapore.
[37]. Wang, X., Wang, S., Zhang, S., Fu, T., Shi, H., & Mei, T. (2018). Support vector guided softmax loss for face recognition. arXiv preprint arXiv:1812.11317.
[39]. Zhang, Z., & Sabuncu, M. (2018). Generalized cross entropy loss for training deep neural networks with noisy labels. Advances in Neural Information Processing Systems, 31.
[40]. Zhou, J., Jiang, T., Li, Z., Li, L., & Hong, Q. (2019, September). Deep speaker embedding extraction with channel-wise feature responses and additive supervision softmax loss function. In Interspeech (pp. 2883-2887).
[41]. Zinkevich, M., Weimer, M., Li, L., & Smola, A. (2010). Parallelized stochastic gradient descent. Advances in Neural Information Processing Systems, 23, 1-9.