A Review on Parkinson's Disease Diagnosis Using Machine Learning Techniques

Chnachal * Megha MISHRA ** Vishnu Kumar Mishra ***

*-*** Department of Computer Science, Shri Shankaracharya Engineering College, Shri Shankaracharya Group of Institutions, Chhattisgarh, India.

Abstract

Parkinson's disease (PD) is a neurodegenerative disorder of the nervous system whose underlying cause is believed to be decreased production of dopamine in the forebrain. This occurs as neurons in the substantia nigra of the brain gradually die. PD is a chronic and progressive illness in which new symptoms may develop over time (Nilashi et al., 2016). People with PD may find it difficult to perform everyday tasks in the workplace. Although clinical evaluations draw on a large amount of data covering many aspects of a patient's condition, it is not always easy to determine from these data alone whether a person has PD. Feature selection methods can help address this issue. Various techniques for diagnosing PD are being researched, developed, and evaluated. This study provides an overview of the use of machine learning algorithms to predict PD, the new techniques that have been developed, and the accuracies that have been achieved.

Keywords :

PD (Parkinson Disease),
Dopamine,
SVM (Support Vector Machine),
KNN (K Nearest Neighbor),
ANN (Artificial Neural Network).

Introduction

Parkinson's disease affects a significant number of people worldwide, and the central nervous system is its primary target. Caring for a person diagnosed with PD can be emotionally and physically taxing for those around them, and patients may experience symptoms such as depression, difficulty concentrating, painful spasms, and more. Parkinson's disease presents a wide range of clinical manifestations, including both motor and non-motor symptoms. Motor signs include hypophonic speech, stiffness, and resting tremor, while non-motor symptoms include hallucinations, depression, constipation, sleep difficulties, cognitive impairment, and impulse control issues. Non-motor symptoms are often more indicative of the disease than motor signs (Ahlrichs & Lawo, 2013; Rovini et al., 2018; Surathi et al., 2016). Medical professionals often face challenges in determining whether a patient is currently affected by Parkinson's disease or is at risk of developing it (Ene, 2008). To address this challenge, it is necessary to design and implement a computational model that can analyze patient data and predict, with a suitable degree of accuracy, whether a patient is likely to develop Parkinson's disease. In most cases, people diagnosed with Parkinson's disease experience vocal impairment, also known as dysphonia. Several voice-related measurements are associated with dysphonia and can be used to evaluate patients at different stages of the disease (Åström & Koker, 2011).

This research presents a survey on predictions of Parkinson's disease (PD) using machine learning and deep learning approaches that have produced effective models, highlighting the potency of these algorithms in terms of the accuracies achieved, as well as the diverse methodologies utilized (Faust et al., 2018).

1. Literature Review

1.1 The Significance of Voice-Related Data

It is generally accepted that speech or voice data can help recognize the presence of Parkinson's disease in about 90 percent of cases. Individuals with PD typically struggle with their speech, and their difficulties can be classified into two distinct types: hypophonia and dysarthria. Both are symptoms of damage to the central nervous system. Hypophonia refers to a voice that is very faint and feeble, while dysarthria refers to slow or slurred speech that is often difficult to understand. Thus, most doctors who treat Parkinson's disease patients check for dysarthria and attempt to rehabilitate patients with specific treatments that improve their ability to modulate vocal intensity (National Parkinson Foundation, n.d.).

Several distinct approaches to the early identification of PD have been developed using various ML methods. However, if the diagnosis and classification are not made precisely and in a timely manner, additional symptoms may develop. PD can be diagnosed by analyzing various types of data, including brain MRI images, voice data, posture images, sensor recordings, and handwriting data, among others. Among these, speech or voice data is the most useful for accurately identifying PD (Delenclos et al., 2016). Ericsson et al. (2005) developed a fully automated two-fold technique using 3D images which, according to their experiments, demonstrated promising results. Little et al. (2009) introduced Pitch Period Entropy (PPE), a novel measure of dysphonia, and evaluated their approach with a kernel support vector machine, which resulted in a classification accuracy of 91%.

Schönweiler et al. (2000) developed an alternative strategy that utilizes voice analysis with an Artificial Neural Network (ANN) and demonstrated good results. However, they noted that cost-effectiveness was a challenge. Ene (2008) proposed a neural network-based strategy that distinguished between healthy individuals and those with Parkinson's Disease (PD) using three different internal procedures.

Gil and Johnsson (2009) found that reducing the number of neurons in the hidden layer of the network resulted in poor performance on both the training set and the test set. However, with a higher number of neurons, there was a significant risk of overfitting, even though performance on the training set was good. After experimentation, they found that 13 neurons in the hidden layer was the optimal number for their model.

Bhattacharya and Bhatia (2010) discovered variance in the ROC curve and observed that the TP and FP rates exhibit variations when the number of CV folds increases.
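For readers less familiar with these quantities, the TP and FP rates that define a point on the ROC curve can be computed from predicted and true labels as in the following minimal sketch. The function name, the label convention (1 for PD, 0 for healthy), and the sample labels are illustrative assumptions and are not taken from the cited study.

```python
def tp_fp_rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels.

    Assumed convention for this sketch: 1 = PD, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard against division by zero when a class is absent.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical labels, made up purely for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
tpr, fpr = tp_fp_rates(y_true, y_pred)
print(f"TP rate = {tpr:.2f}, FP rate = {fpr:.2f}")
```

Repeating this computation on the held-out fold of each cross-validation split would show how the two rates vary with the number of folds.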

Åström and Koker (2011) proposed a novel approach of using parallel neural networks and recommended evaluating the results of each neural network using a decision-making process based on pre-determined rules. During the training process, the data that had not been learned by each neural network were collected and added to the training set of the subsequent neural network so that it could learn from the errors of the previous neural networks. This improved the accuracy of predictions. Chen et al. (2013) developed their FKNN-based system using a 10-fold cross-validation approach.
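The 10-fold cross-validation mentioned above can be illustrated with a minimal sketch that partitions sample indices into k folds, each used once as the test set. The helper name `k_fold_indices` and the sample count are hypothetical, and the sketch uses contiguous folds without the shuffling or stratification that a practical evaluation would normally add; it is not the procedure of any cited study.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one sample. No shuffling is done here.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical dataset of 23 samples split into 10 folds.
for train, test in k_fold_indices(23, k=10):
    print(f"train on {len(train)} samples, test on {len(test)} samples")
```

Averaging the accuracy obtained on the ten test folds gives a single performance estimate in which every sample is used for testing exactly once.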

The study conducted by Islam et al. (2014) compared various machine learning techniques based on their performance accuracies in identifying Parkinson's disease (PD) in individuals. The researchers suggested that a new classifier could be developed to achieve higher levels of accuracy (Ahmadlou & Adeli, 2010).

Peng et al. (2016) recommended the use of computer aided analysis with imaging data. They utilized the software BrainLab to analyze the pictures, calculate the thickness of the cortex, the volume of grey matter, and the surface area of the cortex in each Region of Interest (ROI), and then presented their findings. The classification performance was significantly enhanced by the use of multilevel ROI-based features.

Avci and Dogantekin (2016) achieved high levels of accuracy with their proposed method, called Genetic Algorithm-Wavelet Kernel-Extreme Learning Machine.

Prashanth et al. (2016) found that multimodal characteristics may be used to provide an accurate prediction of PD at an earlier stage.

Using a genetic algorithm and Principal Component Analysis (PCA) as feature selection methods, Aich et al. (2019) proposed an innovative method for classifying patterns into two categories, PD and non-PD, which increased productivity and saved time. The method involved applying seven Machine Learning (ML) algorithms for classification.

Using ResNet-50, the Optimum-Path Forest (OPF) classifier, and Bayes with Support Vector Machines (SVM), Passos et al. (2018) were able to reach a 96% identification rate. Gupta et al. (2018) introduced a new approach that used the cuttlefish algorithm for feature selection. Several fitness function approximations were employed to improve the cuttlefish algorithm, and the result is known as the Optimized Cuttlefish Algorithm (OCFA). The study used both decision tree and K-Nearest Neighbor classifiers and achieved an overall accuracy of 94% in identifying PD patients.

Mostafa et al. (2019) proposed the Multiple Feature Evaluation Approach (MFEA) of a multi-agent system for Parkinson's diagnosis. They implemented five classifiers, namely Decision Tree, Random Forests, Neural Network, Naive Bayes, and Support Vector Machine, both before and after applying their approach (Liu et al., 2022). The average improvements in accuracy obtained were 10.51% for Decision Tree, 15.22% for Naive Bayes, 9.19% for Neural Network, and 12.75% and 9.13% for Random Forests and SVM, respectively. Table 1 depicts the survey of various methodologies and their performances.

Table 1. Summary of the Survey of Various Methodologies and their Performances

Notably, among all the ML strategies, ANN and SVM classifiers are used by the majority of the surveyed approaches to obtain fast and accurate predictions. The survey also shows that the majority of the models relied on voice or speech data to diagnose the condition accurately, in part because many clinicians regard voice data as an important indicator.

2. Architecture of ANN

Figure 1 represents the architecture of an Artificial Neural Network with an input layer, one or more hidden layers, and an output layer. The number of hidden layers varies from one network to another. Every circle in the figure represents a neuron, and the inputs and their corresponding weights are processed layer by layer (Zeinali & Story, 2017).

Figure 1. Architecture of Artificial Neural Network

2.1 Input Layer

The data used to construct the neural network may be in the form of text, images, or audio files. In general, the input layer holds the features of the dataset; in the architecture described, each node of the input layer represents one feature.

2.2 Hidden Layer

Each input feature, together with its respective weight, is passed to the first hidden layer. The weights determine how strongly each feature influences the decision or prediction. The hidden layers assist in the feature extraction process by performing calculations on the data at each node. Each node in the first hidden layer receives the products of the input features and their weight values, and its output is then passed on as input to the subsequent hidden layer, and so on. The optimal number of hidden layers and the appropriate number of nodes in each hidden layer depend on the problem and the dataset.
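The computation performed at each hidden node can be made concrete with a minimal sketch of one fully connected layer: every node forms a weighted sum of its inputs and applies an activation function. The bias term, the choice of sigmoid, and all numeric values below are assumptions introduced for illustration; they are not specified in the architecture described here.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """Compute the outputs of one fully connected layer.

    weights[j][i] is the weight from input i to node j;
    biases[j] is an (assumed) bias term for node j.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        # Weighted sum of the inputs arriving at this node.
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


# Hypothetical values: three input features feeding two hidden nodes.
features = [0.5, -1.0, 2.0]
hidden_weights = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.0]]
hidden_biases = [0.0, 0.1]

hidden = dense_layer(features, hidden_weights, hidden_biases, sigmoid)
print(hidden)
```

Stacking several such layers, with the output list of one call used as the `inputs` of the next, gives the layer-by-layer processing described in this section.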

2.3 Output Layer

The activation functions dictate how the nodes in the output layer process the data. Common examples of these functions include tanh, sigmoid, and ReLU. When selecting an activation function, one must consider the type of dataset and the requirements of the task. The output from the last hidden layer is used as input for the output layer, which produces the final output in the desired format.
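The three activation functions named above have simple closed forms, shown in the following minimal sketch. The sample inputs are arbitrary and serve only to show how each function reshapes a node's weighted sum.

```python
import math


def sigmoid(z):
    """Squashes z into (0, 1); often read as a probability for binary outputs."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes z into (-1, 1) and is centred on zero."""
    return math.tanh(z)


def relu(z):
    """Passes positive values unchanged and maps negative values to zero."""
    return max(0.0, z)


# Arbitrary sample inputs, chosen for illustration only.
for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.4f}  tanh={tanh(z):+.4f}  relu={relu(z):.1f}")
```

For a two-class problem such as PD versus healthy, a single sigmoid output node is a common choice, since its value can be compared against a threshold to obtain the final class label.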

3. Discussion

Machine learning techniques now play a significant role in the medical field. In contrast to models generated by traditional methods, models built with ML approaches adapt their results as new data are fed into them. Extensive and focused study is nevertheless required to gain the knowledge necessary for identifying the disease. New machine learning algorithms and methods are being presented at a rapid rate. Among these, some have shown promising results, while others have already proved their usefulness in a variety of contexts. A benefit of ML-generated models is that, in general, the more data are used, the higher the precision and accuracy that can be achieved in predictions.

Conclusion

In this study, we have provided a comprehensive review of machine learning approaches that are used for diagnosing Parkinson's disease by analyzing various types of data. Many researchers have devoted significant effort to predicting Parkinson's disease using innovative methods. The literature survey summarizes the findings of different studies, which indicate that most of the machine learning techniques employed by the authors performed well. However, it is possible that an even more accurate classifier could be built using a unique neural network architecture combined with a specific strategy. To explore this possibility, we plan to develop an artificial neural network with multiple hidden layers and nodes in the near future and compare the accuracy of different implementations.

References

[1]. Ahlrichs, C., & Lawo, M. (2013). Parkinson's disease motor symptoms in machine learning: A review. Health Informatics: An International Journal (HIIJ), 2(4), 1-18. https://doi.org/10.5121/hiij.2013.2401

[2]. Ahmadlou, M., & Adeli, H. (2010). Enhanced probabilistic neural network with local decision circles: A robust classifier. Integrated Computer-Aided Engineering, 17(3), 197-210.

[3]. Aich, S., Kim, H. C., Hui, K. L., Al-Absi, A. A., & Sain, M. (2019, February). A supervised machine learning approach using different feature selection techniques on voice datasets for prediction of Parkinson's disease. In 2019 21st International Conference on Advanced Communication Technology (ICACT) 7(3), 1116-1121. IEEE. https://doi.org/10.23919/ICACT.2019.8701961

[4]. Al-Fatlawi, A. H., Jabardi, M. H., & Ling, S. H. (2016, July). Efficient diagnosis system for Parkinson's disease using deep belief network. In 2016 IEEE Congress on Evolutionary Computation (CEC) (pp. 1324-1330). IEEE. https://doi.org/10.1109/CEC.2016.7743941

[5]. Åström, F., & Koker, R. (2011). A parallel neural network approach to prediction of Parkinson's disease. Expert Systems with Applications, 38(10), 12470-12474. https://doi.org/10.1016/j.eswa.2011.04.028

[6]. Avci, D., & Dogantekin, A. (2016). An expert diagnosis system for Parkinson disease based on genetic algorithm-wavelet kernel-extreme learning machine. Parkinson's Disease, 2016. https://doi.org/10.1155/2016/5264743

[7]. Bhat, S., Acharya, U. R., Hagiwara, Y., Dadmehr, N., & Adeli, H. (2018). Parkinson's disease: Cause factors, measurable indicators, and early diagnosis. Computers in Biology and Medicine, 102, 234-241. https://doi.org/10.1016/j.compbiomed.2018.09.008

[8]. Bhattacharya, I., & Bhatia, M. P. S. (2010). SVM classification to distinguish Parkinson disease patients. In Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India (pp. 1-6). https://doi.org/10.1145/1858378.1858392

[9]. Cauwenberghs, G., & Poggio, T. (2000). Incremental and decremental support vector machine learning. Advances in Neural Information Processing Systems, 13.

[10]. Chen, H. L., Huang, C. C., Yu, X. G., Xu, X., Sun, X., Wang, G., & Wang, S. J. (2013). An efficient diagnosis system for detection of Parkinson's disease using fuzzy k-nearest neighbor approach. Expert Systems with Applications, 40(1), 263-271. https://doi.org/10.1016/j.eswa.2012.07.014

[11]. Chen, H. L., Wang, G., Ma, C., Cai, Z. N., Liu, W. B., & Wang, S. J. (2016). An efficient hybrid kernel extreme learning machine approach for early diagnosis of Parkinson's disease. Neurocomputing, 184, 131-144. https://doi.org/10.1016/j.neucom.2015.07.138

[12]. Cho, C. W., Chao, W. H., Lin, S. H., & Chen, Y. Y. (2009). A vision-based analysis system for gait recognition in patients with Parkinson's disease. Expert Systems with Applications, 36(3), 7033-7039. https://doi.org/10.1016/j.eswa.2008.08.076

[13]. Das, R. (2010). A comparison of multiple classification methods for diagnosis of Parkinson disease. Expert Systems with Applications, 37(2), 1568-1572. https://doi.org/10.1016/j.eswa.2009.06.040

[14]. Delenclos, M., Jones, D. R., McLean, P. J., & Uitti, R. J. (2016). Biomarkers in Parkinson's disease: Advances and strategies. Parkinsonism & Related Disorders, 22, S106-S110. https://doi.org/10.1016/j.parkreldis.2015.09.048

[15]. Ene, M. (2008). Neural network-based approach to discriminate healthy people from those with Parkinson's disease. Annals of the University of Craiova-Mathematics and Computer Science Series, 35, 112-116.

[16]. Ericsson, A., Lonsdale, M. N., Astrom, K., Edenbrandt, L., & Friberg, L. (2005). Decision support system for the diagnosis of Parkinson's disease. In Image Analysis: 14th Scandinavian Conference, SCIA 2005, Joensuu, Finland, June 19-22, 2005. Proceedings 14 (pp. 740-749). Springer Berlin Heidelberg. https://doi.org/10.1007/11499145_75

[17]. Faust, O., Hagiwara, Y., Hong, T. J., Lih, O. S., & Acharya, U. R. (2018). Deep learning for healthcare applications based on physiological signals: A review. Computer Methods and Programs in Biomedicine, 161, 1-13. https://doi.org/10.1016/j.cmpb.2018.04.005

[18]. Gil, D., & Johnsson, M. (2009). Diagnosing Parkinson by using artificial neural networks and support vector machines. Global Journal of Computer Science and Technology, 9(4), 63-71.

[19]. Gupta, D., Julka, A., Jain, S., Aggarwal, T., Khanna, A., Arunkumar, N., & de Albuquerque, V. H. C. (2018). Optimized cuttlefish algorithm for diagnosis of Parkinson's disease. Cognitive Systems Research, 52, 36-48. https://doi.org/10.1016/j.cogsys.2018.06.006

[20]. Hariharan, M., Polat, K., & Sindhu, R. (2014). A new hybrid intelligent system for accurate detection of Parkinson's disease. Computer Methods and Programs in Biomedicine, 113(3), 904-913. https://doi.org/10.1016/j.cmpb.2014.01.004

[21]. Hirschauer, T. J., Adeli, H., & Buford, J. A. (2015). Computer-aided diagnosis of Parkinson's disease using enhanced probabilistic neural network. Journal of Medical Systems, 39, 1-12. https://doi.org/10.1007/s10916-015-0353-9

[22]. Islam, M. S., Parvez, I., Deng, H., & Goswami, P. (2014, May). Performance comparison of heterogeneous classifiers for detection of Parkinson's disease using voice disorder (dysphonia). In 2014 International Conference on Informatics, Electronics & Vision (ICIEV) (pp. 1-7). IEEE. https://doi.org/10.1109/ICIEV.2014.6850849

[23]. Lima, D. W. C., Ferreira, L. A., Vieira, A. N., Azevedo, L. D. S., Silva, A. P., da Cunha, B. M. C., & Sousa, L. C. A. (2018). Ditos sobre o uso abusivo de álcool e outras drogas: significados e histórias de vida. SMAD, Revista Eletrônica Saúde Mental Álcool e Drogas (Edição em Português), 14(3), 151-158. https://doi.org/10.11606/issn.1806-6976.smad.2018.000396

[24]. Lipton, Z. C., Kale, D., & Wetzel, R. (2016, December). Directly modeling missing data in sequences with rnns: Improved classification of clinical time series. In Machine Learning for Healthcare Conference (pp. 253-270). PMLR.

[25]. Little, M., McSharry, P., Hunter, E., Spielman, J., & Ramig, L. (2009). Suitability of dysphonia measurements for telemonitoring of Parkinson's disease. IEEE Transactions on Biomedical Engineering, 56(4), 1015-1022. https://doi.org/10.1038/npre.2008.2298.1

[26]. Liu, W., Liu, J., Peng, T., Wang, G., Balas, V. E., Geman, O., & Chiu, H. W. (2022). Prediction of Parkinson's disease based on artificial neural networks using speech datasets. Journal of Ambient Intelligence and Humanized Computing, 1-14. https://doi.org/10.1007/s12652-022-03825-w

[27]. Mandal, I., & Sairam, N. (2014). New machine-learning algorithms for prediction of Parkinson's disease. International Journal of Systems Science, 45(3), 647-666. https://doi.org/10.1080/00207721.2012.724114

[28]. Mostafa, S. A., Mustapha, A., Mohammed, M. A., Hamed, R. I., Arunkumar, N., Abd Ghani, M. K., ... & Khaleefah, S. H. (2019). Examining multiple feature evaluation and classification methods for improving the diagnosis of Parkinson's disease. Cognitive Systems Research, 54, 90-99. https://doi.org/10.1016/j.cogsys.2018.12.004

[29]. National Parkinson Foundation. (n.d.). What is Parkinson's? Retrieved from https://www.parkinson.org/understanding-parkinsons/what-is-parkinsons

[30]. Nilashi, M., Ibrahim, O., & Ahani, A. (2016). Accuracy improvement for predicting Parkinson's disease progression. Scientific Reports, 6(1), 34181. https://doi.org/10.1038/srep34181

[31]. Oung, Q. W., Muthusamy, H., Basah, S. N., Lee, H., & Vijean, V. (2018). Empirical wavelet transform based features for classification of Parkinson's disease severity. Journal of Medical Systems, 42, 1-17. https://doi.org/10.1007/s10916-017-0877-2

[32]. Pagan, F. L. (2012). Improving outcomes through early diagnosis of Parkinson's disease. American Journal of Managed Care, 18(7), 176-182.

[33]. Passos, L. A., Pereira, C. R., Rezende, E. R., Carvalho, T. J., Weber, S. A., Hook, C., & Papa, J. P. (2018, May). Parkinson disease identification using residual networks and optimum-path forest. In 2018 IEEE 12th International Symposium on Applied Computational Intelligence and Informatics (SACI) (pp. 325-330). https://doi.org/10.1109/SACI.2018.8441012

[34]. Peng, B., Zhou, Z., Geng, C., Tong, B., Zhou, Z., Zhang, T., & Dai, Y. (2016, October). Computer aided analysis of cognitive disorder in patients with Parkinsonism using machine learning method with multilevel ROI based features. In 2016 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) (pp. 1792-1796). IEEE. https://doi.org/10.1109/CISP-BMEI.2016.7853008

[35]. Prashanth, R., Roy, S. D., Mandal, P. K., & Ghosh, S. (2016). High-accuracy detection of early Parkinson's disease through multimodal features and machine learning. International Journal of Medical Informatics, 90, 13-21. https://doi.org/10.1016/j.ijmedinf.2016.03.001

[36]. Rovini, E., Maremmani, C., Moschetti, A., Esposito, D., & Cavallo, F. (2018). Comparative motor pre-clinical assessment in Parkinson's disease using supervised machine learning approaches. Annals of Biomedical Engineering, 46, 2057-2068. https://doi.org/10.1007/s10439-018-2104-9

[37]. Sakar, C. O., & Kursun, O. (2010). Telediagnosis of Parkinson's disease using measurements of dysphonia. Journal of Medical Systems, 34, 591-599. https://doi.org/10.1007/s10916-009-9272-y

[38]. Schönweiler, R., Hess, M., Wübbelt, P., & Ptok, M. (2000). Novel approach to acoustical voice analysis using artificial neural networks. Journal of the Association for Research in Otolaryngology, 1, 270-282.

[39]. Sriram, T. V., Rao, M. V., Narayana, G. S., & Kaladhar, D. S. V. G. K. (2015). Diagnosis of Parkinson disease using machine learning and data mining systems from voice dataset. In Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014, 1 (pp. 151-157). Springer International Publishing. https://doi.org/10.1007/978-3-319-11933-5_17

[40]. Surathi, P., Jhunjhunwala, K., Yadav, R., & Pal, P. K. (2016). Research in Parkinson's disease in India: A review. Annals of Indian Academy of Neurology, 19(1), 9-20. https://doi.org/10.4103/0972-2327.167713

[41]. Tsanas, A., Little, M. A., McSharry, P. E., Spielman, J., & Ramig, L. O. (2012). Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease. IEEE Transactions on Biomedical Engineering, 59(5), 1264-1271. https://doi.org/10.1109/TBME.2012.2183367

[42]. Zeinali, Y., & Story, B. A. (2017). Competitive probabilistic neural network. Integrated Computer-Aided Engineering, 24(2), 105-118.