Leveraging the Power of Hybrid Machine Learning Algorithms to Predict Cardiovascular Diseases - A Review

Anuradha P. *  Vasantha Kalyani David **
* Research Scholar, Department of Computer Science, Avinashilingam Institute for Home Science and Higher Education for Women, Tamil Nadu, India.
** Professor, Department of Computer Science, Avinashilingam Institute for Home Science and Higher Education for Women, Tamil Nadu, India.

Abstract

As people become more health conscious, preventive health care is gaining importance over diagnostic health care. The goal of future medicine is to provide personalized medical care. According to the World Health Organization (WHO), 31% of all global deaths are due to Cardiovascular Diseases (CVDs). To help prevent heart diseases, the hidden, unexplored information in health care data can be efficiently extracted by applying hybrid Machine Learning algorithms. These algorithms would help medical practitioners gain insight into high-dimensional data, thereby assisting them in predicting cardiac arrests before they occur. This would enhance medical care and reduce costs for patients. This paper surveys the statistical and hybrid Machine Learning algorithms used for feature selection, prediction, and performance evaluation.

Introduction

Cardiovascular Disease includes Coronary Heart Disease (CHD), Cerebrovascular Disease (stroke), Hypertensive Heart Disease (HHD), Congenital Heart Disease, Peripheral Artery Disease (PAD), Rheumatic Heart Disease (RHD), and Inflammatory Heart Disease (Srinivas et al., 2010b).

The risk factors that lead to heart disease are gender, age, a family history of heart disease, being postmenopausal (in females), lifestyle, smoking, high Low Density Lipoprotein (LDL) or "bad" cholesterol, low High Density Lipoprotein (HDL) or "good" cholesterol, hypertension (high blood pressure), physical inactivity, obesity, uncontrolled diabetes, and uncontrolled stress and anger.

Health Care Organizations (HCOs) are slowly moving away from the one-size-fits-all approach and are striving to customize services to individual needs. This trend towards personalized medicine produces unprecedented amounts of data, and most biomedical data has far more than three dimensions, making manual analysis difficult and often impossible. Medical experts find it difficult to deal with the complexity of such data. Moreover, they are not interested in the data itself; rather, they need the knowledge and insight that can be drawn from it. Consequently, efficient computational methods, algorithms, and tools for discovering knowledge and interactively gaining insight into high-dimensional data have been developed in the field of Computer Science (Ahmed and Hannan, 2012; Shouman et al., 2012).

1. Machine Learning

Machine Learning (ML) is programming computers to optimize a performance criterion using example data or past experience. In ML, a model is defined up to some parameters, and learning is the execution of a computer program to optimize the parameters of the model using the training data or past experience. The model may be predictive to make predictions in the future, or descriptive to gain knowledge from data, or both (Alpaydin, 2014).

Machine learning algorithms can be classified into two groups:

1.1 Supervised Learning Algorithm

A supervised learning algorithm takes a known set of input data and the corresponding output data, and trains a model to predict the response to new input data.

Supervised learning uses classification and regression techniques to develop predictive models. Classification techniques predict discrete responses, and regression techniques predict continuous responses (Machine Learning with MATLAB, n.d.).

Common algorithms for classification include Support Vector Machine (SVM), Boosted and Bagged Decision Trees, K-Nearest Neighbour (KNN), Naive Bayes, Discriminant Analysis, Logistic Regression, and Neural Networks. Common algorithms for regression include Linear models, Non-Linear models, Regularization, Stepwise Regression, Boosted and Bagged Decision Trees, Neural Networks, and Adaptive Neuro-Fuzzy learning.

1.2 Unsupervised Learning Algorithm

Unsupervised Learning finds hidden patterns or intrinsic structures in data. It is used to draw inferences from datasets consisting of input data without labelled responses. Clustering is the most common unsupervised learning technique. It is used for exploratory data analysis to find hidden patterns or groupings in data. Common algorithms for clustering include K-Means, Hierarchical Clustering, Gaussian Mixture Models, Hidden Markov Models, Self-Organizing Maps, Fuzzy C-Means Clustering, and Subtractive Clustering (Machine Learning with MATLAB, n.d.).

A Support Vector Machine (SVM) constructs a single hyperplane or a set of hyperplanes in a high-dimensional space. These hyperplanes perform the task of classification.

The Decision Tree (DT) classifier breaks a complicated decision-making process into simpler steps by generating a binary tree-like structure that can be analysed easily. A Decision Tree recursively divides the training set until each part is dominated by samples from one class.

K-Nearest Neighbours (KNN) is an instance-based classifier and one of the simplest of all classification algorithms. An unknown sample is classified by relating it to known samples: its K nearest neighbours in the training data take a majority vote, and the class most common among them is assigned to the test sample (Fujita et al., 2016).
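The majority-vote rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance and a small invented two-feature dataset; the function name `knn_predict` and the toy values are hypothetical and do not come from any of the surveyed papers.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training sample
    distances = [
        (math.dist(x, query), label)
        for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest samples (ordered by distance only)
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count the class labels of the neighbours; ties fall to the label seen first
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Invented toy data: label 1 = disease present, 0 = disease absent
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = [0, 0, 0, 1, 1, 1]
prediction = knn_predict(train_X, train_y, (2.9, 3.1), k=3)
```

In practice the features would usually be scaled before computing distances, since attributes measured in large units would otherwise dominate the vote.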

2. Feature Selection

Feature Selection (FS) selects a subset of variables from the original data without transforming them.

The features are input to the classifiers to find the minimum number of features needed to obtain the highest performance. SVM offers the advantage of sparsity and inherently selects the most predictive features. Other feature selection methods are broadly classified into filter methods (e.g. Correlation Coefficient Scores, Chi-Square Tests, Information Gain), wrapper methods (e.g. Step-Wise Covariate modelling, Recursive Feature Elimination), and embedded methods (e.g. Regularization Algorithms, LASSO, Elastic Net, and Ridge Regression) (Bisaso et al., 2017; Saeys et al., 2007; Guyon and Elisseeff, 2003).
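As an illustration of a filter method, the sketch below ranks discrete features by Information Gain, i.e., the reduction in class-label entropy obtained by splitting on a feature. The data are invented for the example, and the feature names are borrowed from the UCI attribute names only for readability; none of the values come from a real dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        # Labels of the samples that take this feature value
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

# Invented toy data: 1 = disease present, 0 = disease absent
labels = [1, 1, 1, 0, 0, 0]
features = {
    "exang": [1, 1, 1, 0, 0, 0],  # separates the two classes perfectly
    "sex":   [1, 0, 1, 0, 1, 0],  # carries very little information
}
# Filter step: rank features by their score, highest first
ranking = sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)
```

A filter method of this kind scores each feature independently of any classifier, which is what distinguishes it from the wrapper and embedded methods listed above.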

Arabasadi et al. (2017) considered four well-known ranking methods for feature selection, namely Gini Index, weight by SVM, Information Gain, and Principal Component Analysis (PCA).

Inbarani et al. (2014) framed new supervised feature selection methods based on the hybridization of Particle Swarm Optimization (PSO), PSO based Relative Reduct (PSO-RR), and PSO based Quick Reduct (PSO-QR). The experimental results on several standard medical datasets demonstrated the efficiency of the proposed technique over existing feature selection techniques.

Chaurasia (2017) shows that the most important attributes for heart disease are CP (Chest Pain), slope (the slope of the peak exercise ST segment), Exang (Exercise Induced Angina), and Rest-ECG (Resting Electrocardiographic results). The Chi-square test, Info Gain test, and Gain Ratio test were used to assess the input variables, and these attributes were found to be the most important ones.

3. Experimental Evaluation

Three performance metrics, namely Accuracy, Kappa Statistic, and Root Mean Square Error (RMSE), are used to evaluate the efficiency of the prediction model. The performance of each model is evaluated, and the accuracies are compared to identify the most promising algorithm.
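The three metrics can be computed as in the following sketch. It assumes binary labels and, for RMSE, that the error is measured between the true label and a predicted probability; that choice, the function names, and the sample values are assumptions made for illustration.

```python
import math
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement between predictions and labels, corrected for chance agreement.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Agreement expected if labels and predictions were independent
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)

def rmse(y_true, y_prob):
    """Root mean square error between true labels and predicted probabilities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))

# Invented example: eight patients, 1 = disease present
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1, 0.3, 0.6]

acc = accuracy(y_true, y_pred)       # 6 of 8 correct
kappa = cohen_kappa(y_true, y_pred)  # chance agreement is 0.5 for this example
error = rmse(y_true, y_prob)
```

Kappa is useful alongside accuracy when the classes are imbalanced, because a classifier that always predicts the majority class can reach high accuracy while its kappa stays at zero.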

Similarly, the performance of the model is evaluated by drawing a Receiver Operating Characteristic (ROC) curve, i.e., the curve of sensitivity (true positive rate) versus 1 - specificity (false positive rate). Sensitivity is the probability that the test result will be positive when the disease is present (the true positive rate), whereas specificity is the probability that the test result will be negative when the disease is absent (the true negative rate). Both specificity and sensitivity are stated as percentage values.
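Sensitivity and specificity follow directly from the confusion matrix, and sweeping a decision threshold over the predicted scores gives the points of an ROC curve. The sketch below shows both under the assumption of binary labels; the helper names and sample values are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification result."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp)

def roc_points(y_true, scores, thresholds):
    """(false positive rate, true positive rate) at each decision threshold."""
    points = []
    for threshold in thresholds:
        y_pred = [1 if s >= threshold else 0 for s in scores]
        sens, spec = sensitivity_specificity(y_true, y_pred)
        points.append((1 - spec, sens))
    return points

# Invented example: eight patients, 1 = disease present
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1, 0.3, 0.6]
sens, spec = sensitivity_specificity(y_true, [1 if s >= 0.5 else 0 for s in scores])
curve = roc_points(y_true, scores, thresholds=[1.1, 0.5, 0.0])
```

Lowering the threshold moves along the curve from the origin (nothing predicted positive) towards the top-right corner (everything predicted positive).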

4. Dataset Description

Most of the published works covered in the literature review section used one or more of three datasets taken from the UCI machine learning repository, namely Cleveland, Hungarian, and Switzerland. All three datasets have the same attributes but different numbers of instances. Table 1 shows the dataset characteristics.

Table 1. Dataset Characteristics

The Cleveland database contains 76 attributes, but most previous research works use only 14 of them. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1, 2, 3, 4) from absence (value 0) (UCI Machine Learning Repository-Heart Disease Dataset). Class distributions are 54% heart disease absent and 46% heart disease present. In the Hungarian dataset, class distributions are 62.5% heart disease absent and 37.5% heart disease present. In the Switzerland dataset, class distributions are 6.5% heart disease absent and 93.5% heart disease present. The 14 most commonly used attributes are described below (UCI Machine Learning Repository).

1. age: age in years
2. sex: sex (1 = male; 0 = female)
3. cp: chest pain type -- 1: typical angina, 2: atypical angina, 3: non-anginal pain, 4: asymptomatic
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholesterol in mg/dl
6. fbs: fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
7. restecg: resting electrocardiographic results -- 0: normal, 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10. oldpeak: ST depression induced by exercise relative to rest
11. slope: the slope of the peak exercise ST segment -- Value 1: upsloping, 2: flat, 3: downsloping
12. ca: number of major vessels (0-3) colored by fluoroscopy
13. thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
14. num: diagnosis of heart disease (angiographic disease status; the predicted attribute) -- Value 0: < 50% diameter narrowing, Value 1: > 50% diameter narrowing
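The presence-versus-absence setup described above (values 1 to 4 collapsed into "present", value 0 kept as "absent") can be sketched as follows. The sketch assumes a comma-separated record holding the 14 attributes in the order listed, with "?" marking a missing value; the sample records and the added `target` field are illustrative only.

```python
ATTRIBUTES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num",
]

def parse_record(line):
    """Parse one comma-separated record into a dict of floats.

    Assumes the 14-attribute layout listed above and '?' for missing values.
    """
    fields = line.strip().split(",")
    if len(fields) != len(ATTRIBUTES):
        raise ValueError(f"expected {len(ATTRIBUTES)} fields, got {len(fields)}")
    record = {
        name: (None if raw == "?" else float(raw))
        for name, raw in zip(ATTRIBUTES, fields)
    }
    # Collapse the 0-4 diagnosis into absent (0) versus present (1)
    record["target"] = None if record["num"] is None else int(record["num"] > 0)
    return record

# Illustrative records in the assumed layout
healthy = parse_record("63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0")
diseased = parse_record("67.0,1.0,4.0,160.0,286.0,0.0,2.0,108.0,1.0,1.5,2.0,?,3.0,2")
```

Missing values are kept as `None` here so that a later step can decide whether to drop or impute them.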

5. Literature Review

5.1 K-Nearest Neighbours and Genetic Algorithm

Jabbar et al. (2013) proposed a new algorithm which combines KNN with a genetic algorithm for effective classification. Genetic algorithms perform a global search in large, complex, and multimodal landscapes and can provide a near-optimal solution. The average accuracy of the proposed approach is 5% higher than that of the KNN approach without GA.
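To make the idea of pairing a genetic algorithm with KNN concrete, the sketch below evolves a bit mask over the features and scores each mask by the leave-one-out accuracy of a nearest-neighbour classifier. It is a generic illustration and not the algorithm of Jabbar et al. (2013): the operators (elitism, single-point crossover, bit-flip mutation), the parameter values, and the toy data are all assumptions.

```python
import math
import random

def loo_knn_accuracy(X, y, mask, k=1):
    """Leave-one-out accuracy of k-NN using only the features where mask is 1."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distances from sample i to every other sample, on the selected columns
        neighbours = sorted(
            (math.dist([xi[c] for c in cols], [xj[c] for c in cols]), yj)
            for j, (xj, yj) in enumerate(zip(X, y)) if j != i
        )[:k]
        votes = [label for _, label in neighbours]
        correct += max(set(votes), key=votes.count) == yi
    return correct / len(X)

def ga_select(X, y, generations=20, pop_size=12, mutation_rate=0.1, seed=0):
    """Search for a feature mask that maximises leave-one-out k-NN accuracy."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        return loo_knn_accuracy(X, y, mask)

    # Start from the full feature set plus random subsets
    population = [[1] * n] + [
        [rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # elitism: keep the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)        # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit flips with probability mutation_rate
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)

# Invented toy data: only the first feature separates the two classes
X = [
    (0.10, 5.0, 0.30, 9.0), (0.20, 1.0, 0.90, 2.0),
    (0.15, 8.0, 0.10, 7.0), (0.30, 3.0, 0.70, 4.0),
    (0.90, 4.8, 0.20, 8.5), (0.80, 1.2, 0.80, 2.5),
    (0.85, 7.5, 0.15, 6.5), (0.70, 3.3, 0.60, 4.5),
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
best = ga_select(X, y)
```

Because the best half of each generation is carried over unchanged and the full feature set is part of the initial population, the returned mask can never score worse than using all features.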

5.2 Artificial Neural Networks (ANN) and Genetic Algorithm

Amin et al. (2013) proposed a hybrid model of Artificial Neural Networks (ANN) and Genetic Algorithm (GA), in which the GA optimizes the connection weights of the ANN, thereby improving its performance. Data on the risk factors of 50 patients were analysed, and the results showed a training accuracy of 96.2% and a validation accuracy of 89%.

Arabasadi et al. (2017) used a dataset containing information on 303 patients, of whom 216 had suffered from Coronary Artery Disease. For feature selection, weight by SVM was used, and the features with weights greater than 0.20 were selected. The initial weights of the neural network were identified via Genetic Algorithm, and the Neural Network was then trained. The proposed method, Genetic Algorithm with Neural Networks, performed much better, with an accuracy of 93.85%, than the model with only a Neural Network, which had an accuracy of 84.62%.

5.3 Artificial Neural Networks (ANN) and Fuzzy Logic

Parthiban and Subramanian (2008) proposed the Coactive Neuro-Fuzzy Inference System (CANFIS) model, which combines the adaptive capabilities of neural networks with the qualitative approach of fuzzy logic and is then integrated with a genetic algorithm to diagnose the presence of heart disease. GA was used to auto-tune the CANFIS parameters and select the optimal feature set.

5.4 Mamdani Fuzzy Inference System

Srinivas et al. (2010b) used the attributes to generate fuzzy rules, which were then weighted based on their frequency in the learning datasets. These weighted fuzzy rules were used to build a clinical decision support system using the Mamdani fuzzy inference system. The experimental results on the UCI machine learning repository showed that the proposed system performed better than the neural network based system.

5.5 Rough Set Theory and Fuzzy Classifier

Srinivas et al. (2014) presented a rough-fuzzy classifier, in which rule generation is done using rough set theory and prediction is done using a fuzzy classifier. The experimentation was carried out using the Cleveland, Hungarian, and Switzerland datasets. The proposed rough-fuzzy classifier outperformed the previous approach, achieving an accuracy of 80% on the Switzerland dataset and 42% on the Hungarian dataset.

5.6 Genetic Algorithm and Modified Dynamic Multiswarm Particle Swarm Optimization

Paul et al. (2017) developed an automatic Fuzzy diagnostic System (FS) based on Genetic Algorithm (GA) and a Modified Dynamic Multi-Swarm Particle Swarm Optimization (MDMS-PSO) for predicting the risk level of heart disease. Here, the effective attributes are selected through statistical methods, such as Correlation coefficient, R-Squared, and the Weighted Least Squares (WLS) method. The weighted fuzzy rules are formed on the basis of the selected attributes using GA, and then MDMS-PSO is used to optimize the membership functions. Finally, the ensemble FS is built from the generated fuzzy knowledge base by fusing the different local FSs. The proposed method achieved 92.31% accuracy on the Cleveland dataset, 95.56% on the Hungarian dataset, and 89.47% on the Switzerland dataset, thereby outperforming the other classifiers used for comparison.

5.7 Genetic Algorithm and Particle Swarm Optimization with SVM Classifier

Iftikhar et al. (2017) proposed a hybrid method in which GA and PSO are used to select fewer but more discriminative features in order to significantly improve SVM classification accuracy. The reduced search space for identifying the best solution during heart disease data analysis in turn reduced the consumption of computing resources.

5.8 Decision Tree and Fuzzy Expert System

Mahmoodabadi and Tabrizi (2015) proposed a fuzzy expert system based on the Imperialist Competitive Algorithm (ICA) to classify heart disease data. A Decision Tree was used to determine the important attributes in order to obtain valid rules, and ICA was used to optimise the membership functions. The ICA-based system achieved an accuracy of 94.92% and proved more efficient than PSO, especially in terms of speed of convergence.

5.9 Decision Tree and Genetic Algorithm

Soni et al. (2011) stated that Decision Tree (DT) outperforms other classification algorithms except Bayesian classification. After a Genetic Algorithm was applied to reduce the data to an optimal subset of attributes, the accuracy of the Decision Tree and Bayesian Classification improved further.

Patel et al. (2013) proposed a hybrid model with Genetic Algorithm and Decision Tree to predict heart diseases. In this model, fourteen attributes are reduced to six attributes by using Genetic Algorithm. The Decision Tree technique outperformed Naive Bayes and Classification via Clustering, with an accuracy of 99.2%.

Shouman et al. (2011) proposed that nine-voting with equal frequency discretization and a Gain Ratio Decision Tree can enhance the accuracy of heart disease diagnosis. This model yields an accuracy of 84.1%, compared to the Bagging algorithm by Tu et al. (2009), which had an accuracy of 81.41%.
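Equal frequency discretization, mentioned above as a preprocessing step for the decision tree, converts a continuous attribute into bins that each hold roughly the same number of samples. The sketch below shows one simple rank-based way to do this; it covers only the discretization step, not the voting scheme, and the function name and sample ages are hypothetical.

```python
def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so that bins hold roughly equal numbers of samples.

    Values are ranked and the ranks are cut into n_bins equal-sized groups.
    Tied values at a cut point may land in different bins in this simple version.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        bins[index] = rank * n_bins // len(values)
    return bins

# Invented ages of nine patients, split into three equally populated bins
ages = [63, 41, 58, 29, 67, 52, 45, 60, 54]
age_bins = equal_frequency_bins(ages, n_bins=3)
```

Unlike equal width binning, this keeps every bin populated even when the attribute is heavily skewed.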

5.10 K-Nearest Neighbours and Apriori Associative Rules

Singh et al. (2016) have proposed a hybrid technique for Classification Associative Rules (CARs). The comparative results show that IBk (k Nearest Neighbour) with Apriori Associative Algorithms produces better results than others with a prediction accuracy of 99.19%.

5.11 Rough Sets based Attribute Reduction with Interval Type-2 Fuzzy Logic System

Long et al. (2015) proposed a combination of Rough Sets based attribute reduction with an Interval Type-2 Fuzzy Logic system for heart disease diagnosis. The Rough Sets based attribute reduction using the Chaos Firefly Algorithm can efficiently find a minimal attribute reduction from a high-dimensional dataset, which enhances the performance of the classification system.

5.12 One Dependency Augmented Naïve Bayes Classifier and Naïve Credal Classifier 2 for Data Preprocessing

Srinivas et al. (2010a) explained that the One Dependency Augmented Naïve Bayes classifier (ODANB) and the Naïve Credal Classifier 2 (NCC2) could be used effectively for data preprocessing and decision making. NCC2 is an extension of Naïve Bayes to imprecise probabilities that aims to deliver robust classifications even when dealing with small or incomplete data sets.

5.13 Conventional Logistic Regression

Austin et al. (2013) found that modern flexible tree-based methods like Boosted trees, Bagged trees, and Random Forests offer substantial improvement in the prediction and classification of Heart Failure (HF) subtype compared to conventional classification and regression trees. The data is a population-based sample of patients from Ontario, Canada. However, conventional logistic regression was able to predict the probability of the presence of Heart Failure with Preserved Ejection Fraction (HFPEF) among patients with HF more accurately than the flexible tree-based methods.

5.14 Deep Belief Network

Kim et al. (2017) proposed a statistical DBN-based prediction model. The dataset used is from the sixth Korea National Health and Nutrition Examination Survey (KNHANES-VI) 2013. In this model, statistical analysis was first performed to find variables related to cardiovascular disease, and a learning model based on the Deep Belief Network (DBN) was then developed. The model showed an accuracy of 83.9% and an area under the ROC curve of 0.790. The proposed statistical DBN performed better than other prediction algorithms.

5.15 Comparing Classifiers

Chaurasia (2017) compared three classifiers, namely Iterative Dichotomiser 3 (ID3), Classification and Regression Trees (CART), and DT, in the diagnosis of patients with heart diseases. The observations show that CART outperformed the other two classification methods with an accuracy of 83.49%, and the total time taken to build the model was 0.23 seconds. The CART classifier had the lowest average error, at 0.3, compared to the others. The empirical results show that a short but accurate prediction list for heart patients can be produced by applying the predictive models to the records of incoming new patients. This study also works to identify those patients who need special attention.

Masethe and Masethe (2014) compiled a data set from data collected from medical practitioners in South Africa. The paper reports that the best classification techniques are J48, REPTree, and the Simple CART algorithm, which perform similarly with an accuracy of 99.07%.

It also shows that the Bayes Net algorithm, with an accuracy of 98.148%, outperformed the Naïve Bayes algorithm, which gave an accuracy of 97.22%.

5.16 Statistical Analysis based Heart Disease Recommender Model

Mustaqeem et al. (2017) have proposed a system that can aid physicians and patients in taking quick clinical decisions regarding disease diagnosis and medical treatments. It is a hybrid of two models, i.e., Heart Disease Prediction Model (HD_PM) and Statistical Analysis based Heart Disease Recommender Model (SAbHD_RM). The dataset is collected from a known hospital with the consent of patients for a multi-class classification of heart diseases and consists of a total of 1070 records, including patients suffering from critical, moderate, and normal severity heart diseases. The SAbHD_RM utilizes the data from the knowledge base that is created in consultation with the medical experts. The results of the recommendation model are evaluated using confusion matrix, which gives an accuracy of 97.8%.

5.17 hs-CRP - A New Biomarker

Tayefi et al. (2017) collected data on patients referred to Ghaem Hospital, Mashhad, Iran for coronary angiography between September 2011 and May 2013. Ten of the 12 variables were fed into the DT algorithm (age, sex, FBG, TG, hs-CRP, TC, HDL, LDL, SBP, and DBP). The proposed model could identify the associated risk factors of CHD with sensitivity, specificity, and accuracy of 96%, 87%, and 94%, respectively. The model had serum hs-CRP levels at the top of the tree, followed by FBG, gender, and age. This study shows that hs-CRP, a new biomarker, is associated with CHD even more strongly than traditional biomarkers such as FBG and LDL. Further studies can be conducted to investigate novel biomarkers of CHD such as hs-CRP, which may improve models of CHD risk assessment.

6. Discussion

In this study, it is observed that feature selection is the first important step, which should output the features most needed for the prediction of heart diseases. SVM, PSO, Information Gain, and Gini Index are already being used for feature selection. Genetic Algorithms play a major role in optimizing the inputs. Combining GA with PSO has also produced better results.

From the results on comparison of classifiers, a further comparison of the Bayes Net and CART algorithms could be performed to determine the better of the two. As hs-CRP is strongly associated with heart diseases, it should be included among the variables in future research on the prediction of heart diseases.

Conclusion

The selection of the best features plays a crucial role in achieving accurate prediction results. It is also observed that, in hybrid Machine Learning techniques, Genetic Algorithms play a major role in optimizing the inputs. In future, other evolutionary algorithms could be tested and compared in place of GA. Further, different combinations of algorithms can be considered and analyzed in order to predict CVD accurately. With inputs from medical experts, a better Clinical Decision Support System can be devised, which will assist the experts in predicting heart disease and also help patients with early detection of the disease, thereby reducing the cost involved in diagnosis.

References

[1]. Ahmed, A., & Hannan, S. A. (2012). Data Mining Techniques to find out heart diseases: An overview. International Journal of Innovative Technology and Exploring Engineering (IJITEE), 1(4), 18-23.
[2]. Alpaydin, E. (2014). Introduction to Machine Learning. MIT Press.
[3]. Amin, S. U., Agarwal, K., & Beg, R. (2013, April). Genetic neural network based data mining in prediction of heart disease using risk factors. In Information & Communication Technologies (ICT), 2013 IEEE Conference on (pp. 1227-1231). IEEE.
[4]. Arabasadi, Z., Alizadehsani, R., Roshanzamir, M., Moosaei, H., & Yarifard, A. A. (2017). Computer aided decision making for heart disease detection using hybrid neural network-Genetic algorithm. Computer Methods and Programs in Biomedicine, 141, 19-26.
[5]. Austin, P. C., Tu, J. V., Ho, J. E., Levy, D., & Lee, D. S. (2013). Using methods from the data-mining and machine-learning literature for disease classification and prediction: A case study examining classification of heart failure subtypes. Journal of Clinical Epidemiology, 66(4), 398-407.
[6]. Bisaso, K. R., Anguzu, G. T., Karungi, S. A., Kiragga, A., & Castelnuovo, B. (2017). A survey of machine learning applications in HIV clinical research and care. Computers in Biology and Medicine, 91, 366-371.
[7]. Chaurasia, V. (2017). Early prediction of heart diseases using data mining techniques. Carib. J. Sci. Tech., 1, 208-217.
[8]. Fujita, H., Acharya, U. R., Sudarshan, V. K., Ghista, D. N., Sree, S. V., Eugene, L. W. J., & Koh, J. E. (2016). Sudden Cardiac Death (SCD) prediction based on nonlinear heart rate variability features and SCD index. Applied Soft Computing, 43, 510-519.
[9]. Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157-1182.
[10]. Iftikhar, S., Fatima, K., Rehman, A., Almazyad, A. S., & Saba, T. (2017). An evolution based hybrid approach for heart diseases classification and associated risk factors identification. Biomedical Research, 28(8), 3451-3455.
[11]. Inbarani, H. H., Azar, A. T., & Jothi, G. (2014). Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis. Computer Methods and Programs in Biomedicine, 113(1), 175-185.
[12]. Jabbar, M.A., Deekshatulu, B. L., & Chandra, P. (2013). Classification of heart disease using k-nearest neighbor and genetic algorithm. Procedia Technology, 10, 85-94.
[13]. Kim, J., Kang, U., & Lee, Y. (2017). Statistics and Deep Belief Network-Based Cardiovascular Risk Prediction. Healthcare Informatics Research, 23(3), 169-175.
[14]. Long, N. C., Meesad, P., & Unger, H. (2015). A highly accurate firefly based algorithm for heart disease prediction. Expert Systems with Applications, 42(21), 8221-8231.
[15]. Machine Learning with MATLAB. (n.d.). In MathWorks. Retrieved from https://in.mathworks.com/solutions/machine-learning
[16]. Mahmoodabadi, Z., & Tabrizi, S. S. (2015). A new efficient algorithm based on ICA for diagnosis of Coronary Artery Disease. International Journal of Telemedicine and Clinical Practices, 1(2), 157-173.
[17]. Masethe, H. D., & Masethe, M. A. (2014, October). Prediction of heart disease using classification algorithms. In Proceedings of the World Congress on Engineering and Computer Science (Vol. 2, pp. 22-24).
[18]. Mustaqeem, A., Anwar, S. M., Khan, A. R., & Majid, M. (2017). A statistical analysis based recommender model for heart disease patients. International Journal of Medical Informatics, 108, 134-145.
[19]. Parthiban, L., & Subramanian, R. (2008). Intelligent heart disease prediction system using CANFIS and genetic algorithm. International Journal of Medical and Health Sciences, 1(5), 278-281.
[20]. Patel, S. B., Yadav, P. K., & Shukla, D. D. (2013). Predict the diagnosis of heart disease patients using classification mining techniques. IOSR Journal of Agriculture and Veterinary Science (IOSR-JAVS), 4(2), 61-64.
[21]. Paul, A. K., Shill, P. C., Rabin, M. R. I., & Murase, K. (2017). Adaptive weighted fuzzy rule-based system for the risk level assessment of heart disease. Applied Intelligence, 48, 1-18.
[22]. Saeys, Y., Inza, I., & Larrañaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19), 2507-2517.
[23]. Shouman, M., Turner, T., & Stocker, R. (2011, December). Using decision tree for diagnosing heart disease patients. In Proceedings of the Ninth Australasian Data Mining Conference (Vol.121, pp. 23-30). Australian Computer Society, Inc.
[24]. Shouman, M., Turner, T., & Stocker, R. (2012, March). Using data mining techniques in heart disease diagnosis and treatment. In Electronics, Communications and Computers (JEC-ECC), 2012 Japan-Egypt Conference on (pp. 173-177). IEEE.
[25]. Singh, J., Kamra, A., & Singhra, H. (2016). Prediction of Heart Diseases using Associative Classification. 5th International Conference on Wireless Networks and Embedded Systems (WECON) (pp. 1-7). IEEE.
[26]. Soni, J., Ansari, U., Sharma, D., & Soni, S. (2011). Predictive data mining for medical diagnosis: An overview of heart disease prediction. International Journal of Computer Applications, 17(8), 43-48.
[27]. Srinivas, K., Rani, B. K., & Govardhan, A. (2010a). Applications of Data Mining Techniques in Healthcare and Prediction of Heart Attacks. International Journal on Computer Science and Engineering, 2(2), 250-255.
[28]. Srinivas, K., Rao, G. R., & Govardhan, A. (2010b, August). Analysis of coronary heart disease and prediction of heart attack in coal mining regions using data mining techniques. In Computer Science and Education (ICCSE), 2010 5th International Conference on (pp. 1344-1349). IEEE.
[29]. Srinivas, K., Rao, G. R., & Govardhan, A. (2014). Rough-Fuzzy classifier: A system to predict the heart disease by blending two different set theories. Arabian Journal for Science and Engineering, 39(4), 2857-2868.
[30]. Tayefi, M., Tajfard, M., Saffar, S., Hanachi, P., Amirabadizadeh, A. R., Esmaeily, H., & Mobarhan, M. G. (2017). hs-CRP is strongly associated with Coronary Heart Disease (CHD): A data mining approach using decision tree algorithm. Computer Methods and Programs in Biomedicine, 141, 105–109.
[31]. Tu, M. C., Shin, D., & Shin, D. (2009, October). Effective diagnosis of heart disease through bagging approach. In Biomedical Engineering and Informatics, 2009. BMEI'09. 2nd International Conference on (pp. 1-4). IEEE.
[32]. UCI Machine Learning Repository-Heart Disease Dataset. Retrieved from https://archive.ics.uci.edu/ml/datasets/Heart+Disease