Annals of R.S.C.B., ISSN:1583-6258, Vol. 25, Issue 6, 2021, Pages. 4269 - 4289 Received 25 April 2021; Accepted 08 May 2021.
A Deductive Learning of Heart Disease Dataset by using K Means Clustering
Arulanantham Zechariah Jebakumar1*, Dr. R. Ravanan2
1Lecturer, Prince Sultan Military College of Health Sciences, Dhahran PO Box: 33048, Dammam – 31448, Kingdom of Saudi Arabia.
2Joint Director of Collegiate Education, Chennai region, Chennai-15, Tamilnadu, India
Corresponding author: Arulanantham Zechariah Jebakumar
Email: [email protected]
Abstract
Cardiovascular diseases (CVDs) are among the most significant causes of mortality in today's world. They are the number one cause of death globally, with 17.9 million deaths each year. Hypertension, diabetes, overweight, and unhealthy lifestyles jointly contribute to CVDs. Exploratory Data Analysis (EDA) is a pre-processing step for understanding the data. There are numerous methods and steps for performing EDA; however, most of them are specific, focusing on visualization and distribution. With 2 clusters, the model assigns 43% and 57% of the instances to its clusters for the full training set and 46% and 54% for the 66% training set. With 3 clusters, it assigns 18%, 48%, and 34% for the full training set and 25%, 50%, and 25% for the 66% training set. With 4 clusters, it assigns 21%, 40%, 10%, and 28% for the full training set and 24%, 13%, 26%, and 37% for the 66% training set. With 5 clusters, it assigns 17%, 31%, 11%, 19%, and 21% for the full training set and 23%, 14%, 20%, 33%, and 11% for the 66% training set. With 6 clusters, it assigns 10%, 31%, 15%, 20%, 6%, and 18% for the full training set and 16%, 18%, 15%, 22%, 13%, and 15% for the 66% training set. This system proposes the optimal results for building the deductive learning model. In terms of time consumption, the models with 2, 3, and 5 clusters take zero seconds to build on the 66% training set, compared with 0.01 seconds for 6 clusters and 0.03 seconds for 4 clusters. The models with 5 and 6 clusters have lower sums of squared errors than the other models for both the full training set and the 66% training set.
Keywords: K Means clustering, Centroids, Sum of Squared Errors, Iterations.
Introduction
This section introduces the research work. Heart diseases kill 17.9 million people every year, accounting for 31% of all deaths in the world [1]. Thus, early and accurate detection of heart diseases is important [2]. 4 out of 5 heart disease patients die due to a heart attack or a stroke, and
raised blood pressure, glucose, and lipids along with overweight and obesity [6]. Lifestyle is also an important factor in heart diseases, along with physiological factors [7]. Tobacco use, unhealthy diet, excessive alcohol intake, and inadequate physical activity are leading causes of heart diseases [8]. Identifying such people and ensuring they are given appropriate treatment could prevent premature deaths.
Section 2 of this paper details the related works. Section 3 presents the materials and methods adopted, and section 4 presents the details of the experiments and discussions. Finally, section 5 concludes the paper by sharing our inferences and future plans.
Related Works
This section reviews the related works. Prior studies report accuracy and precision statistics for different algorithms, with Support Vector Machines (SVM), KNN, Decision Trees, and Neural Networks being the most popular [9]. The UCI dataset has been used to compare classifiers such as Multilayer Perceptron, Naive Bayes, and KNN, validating that SVM with boosting hyperparameters outperformed the others [10]. Machine learning techniques achieved an accuracy of 88.7% in the prediction of cardiovascular diseases with a hybrid random forest and linear model [11]. New feature selection techniques and methods can be adopted to get a broader perception of performance [12]. Traditional machine learning algorithms have been studied with the aim of improving the accuracy of heart disease prediction [13]. A study on the UK Biobank dataset observed that, rather than complex models, information gain was better when different risk factors were considered. A South African dataset consisting of 462 instances was used to analyze algorithms such as Naive Bayes, SVM, and decision trees [14]. Naive Bayes obtained good accuracy; however, specificity and sensitivity can be improved with more instances [15]. The accuracy of decision trees in the prediction of heart diseases was examined with a dataset consisting of 573 instances; a larger number of attributes and hyperparameters can result in better classification performance [16]. Association rules, clustering, and other data mining algorithms prove useful for mining huge amounts of unstructured data [17]. Various kernel implementations have been combined with certain rule-based classifiers [18]. That work concludes that the RBF kernel is best for infinite data and that hyperparameter tuning can be added to make the model more effective [19].
Materials and Methods
This section presents the materials and methods of this research work. The work focuses on exploratory data analysis using Weka 3.8.3. The dataset used is the UCI Heart Disease dataset, which has 76 features (attributes) from 303 patients. This work uses a version of the dataset consisting of 270 patients with 14 features.
Table 1: Meta Data Description
S.No | Attribute | Description of the Attribute | Type | Range
2 | [...] | [...] | [...] | [...] Female=87
3 | chest pain type (cp) | Type of the chest pain | Categorical | Asymptomatic=129; Non Angina=79; Atypical Angina=42; Typical Angina=20
4 | resting blood pressure (restbps) | in mm Hg on admission to the hospital | Continuous | Minimum=94; Maximum=200; Mean=131.34; StdDeviation=17.86
5 | serum cholesterol (chol) | serum cholesterol in mg/dl | Continuous | Minimum=126; Maximum=564; Mean=249.66; StdDeviation=51.69
6 | fasting blood sugar (fbs) | (fbs>120 mg/dl) 0=false; 1=true | Binary | False=230; True=40
7 | Resting ECG (restecg) | 0=Normal; 1=Having ST-T wave abnormality; 2=Showing probable or definite left ventricular hypertrophy | Categorical | Normal=131; ST-T Wave Abnormality=2; Left Ventricular Hypertrophy=137
8 | maximum heart rate achieved (thalach) | maximum heart rate reached | Continuous | Minimum=71; Maximum=202; Mean=149.68; StdDeviation=23.17
9 | exercise induced angina (exang) | 0=No; 1=Yes | Binary | No=181; Yes=89
10 | oldpeak | ST depression induced by exercise relative to rest | Continuous | Minimum=0; Maximum=6.2; Mean=1.05; StdDeviation=1.145
11 | slope | the slope of the peak exercise ST segment: 0=Upsloping; 1=Flat; 2=Downsloping | Categorical | Flat=122; Upsloping=130; Downsloping=18
12 | ca | number of major vessels (0-3) colored by fluoroscopy: 0=Typical Angina; 1=Atypical Angina; 2=Non Anginal Pain; 3=Asymptomatic | Categorical | Typical Angina=160; Atypical Angina=58; Non Anginal Pain=33; Asymptomatic=19
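Because Table 1 mixes continuous and categorical attributes, a Euclidean distance for clustering needs a convention for both kinds. The following is a minimal sketch of one common convention: continuous attributes are min-max scaled with the ranges from Table 1, and a categorical mismatch counts as a difference of 1. We believe this resembles the default behaviour of Weka's Euclidean distance, but that is an assumption, and the function name `mixed_distance`, the attribute subset, and the two sample patients are illustrative only.

```python
import math
from typing import Dict, Tuple, Union

Value = Union[float, str]

# (minimum, maximum) of the continuous attributes as listed in Table 1.
NUMERIC_RANGES: Dict[str, Tuple[float, float]] = {
    "restbps": (94.0, 200.0),
    "chol": (126.0, 564.0),
    "thalach": (71.0, 202.0),
    "oldpeak": (0.0, 6.2),
}


def mixed_distance(a: Dict[str, Value], b: Dict[str, Value]) -> float:
    """Euclidean distance over scaled numeric and 0/1 categorical differences."""
    total = 0.0
    for name, value_a in a.items():
        value_b = b[name]
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            # Difference of the two min-max scaled values.
            diff = (float(value_a) - float(value_b)) / (high - low)
        else:
            # Categorical or binary attribute: 0 if equal, 1 otherwise.
            diff = 0.0 if value_a == value_b else 1.0
        total += diff * diff
    return math.sqrt(total)


# Two illustrative patients described by a subset of the Table 1 attributes.
patient_a = {"restbps": 140, "chol": 268, "thalach": 160, "oldpeak": 3.6,
             "cp": "Asymptomatic", "restecg": "Left Ventricular Hypertrophy",
             "slope": "Downsloping"}
patient_b = {"restbps": 140, "chol": 226, "thalach": 178, "oldpeak": 0.0,
             "cp": "Asymptomatic", "restecg": "Normal", "slope": "Upsloping"}

print(round(mixed_distance(patient_a, patient_b), 3))
```

Scaling keeps an attribute with a wide range, such as serum cholesterol, from dominating the distance purely because of its units.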
Figure 1: Architecture of Proposed System
Results and Discussion
This section presents the results and discussion of this research work. The project covers exploratory data analysis, including data visualization, and implements K means clustering using Weka 3.8.3.
Import and get the data from UCI repository
Data Cleaning and Preprocessing
Implementing K Means clustering
Data Visualization and Interpretations
Model Evaluation
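The clustering step in the list above can be illustrated with a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not Weka's implementation: the function names, the toy two-feature points, and the choice of the first k points as initial centroids are assumptions made for illustration. The function returns the cluster assignments, the centroids, the number of iterations, and the within-cluster sum of squared errors, which are the quantities reported in this section.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two numeric points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, max_iter: int = 100
           ) -> Tuple[List[int], List[List[float]], int, float]:
    """Lloyd's algorithm; the first k points are the initial centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignments: List[int] = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the model has converged.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # An empty cluster keeps its previous centroid.
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignments))
    return assignments, centroids, iterations, sse


# Toy data: two well-separated groups of three points each.
toy_points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
labels, centers, n_iter, sse = kmeans(toy_points, k=2)
print(labels, n_iter, round(sse, 4))
```

In this sketch the iteration count includes the final pass that confirms no assignment changed; other tools may count iterations differently.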
Figure 3: Visualization of Target Attribute
Figure 4: Visualization of Thal Attribute
Figure 5: Visualization of CA Attribute
Figure 6: Visualization of Slope Attribute
Figure 7: Visualization of Old Peak Attribute
Figure 8: Visualization of Exercise_Induced_Angina Attribute
Figure 9: Visualization of Max_Heart Rate Attribute
Figure 10: Visualization of Rest_ECG Attribute
Figure 11: Visualization of Fasting_ECG Attribute
Figure 12: Visualization of Serum_Cholesterol Attribute
Figure 13: Visualization of Resting_Blood Pressure Attribute
Figure 14: Visualization of Chest Pain Attribute
Figure 15: Visualization of Sex Attribute
Figure 16: Visualization of Age Attribute
Figure 17: K Means cluster No=2
Figure 18: K Means cluster=3
Figure 19: K Means cluster=4
Figure 20: K Means cluster=5
The above figures show the K Means clusters of all 14 attributes in the heart disease dataset, used for implementing the deductive learning process.
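S.No | Number of Clusters | Iterations (A) | Iterations (B) | Sum of Squared Errors (A) | Sum of Squared Errors (B) | Cluster Instances (A) | Cluster Instances (B) | Time Taken (in seconds) (A) | Time Taken (in seconds) (B)
1 | 2 | 3 | 4 | 710.46 | 466.36 | 0: 115 (43%); 1: 155 (57%) | 0: 42 (46%); 1: 50 (54%) | 0.01 | 0
2 | 3 | 4 | 4 | 648.26 | 426.63 | 0: 49 (18%); 1: 130 (48%); 2: 91 (34%) | 0: 23 (25%); 1: 46 (50%); 2: 23 (25%) | 0.01 | 0
3 | 4 | 5 | 8 | 608.71 | 398.56 | 0: 57 (21%); 1: 109 (40%); 2: 28 (10%); 3: 76 (28%) | 0: 22 (24%); 1: 12 (13%); 2: 24 (26%); 3: 34 (37%) | 0.01 | 0.03
4 | 5 | 6 | 5 | 581.95 | 379.77 | 0: 45 (17%); 1: 85 (31%); 2: 31 (11%); 3: 51 (19%); 4: 58 (21%) | 0: 21 (23%); 1: 13 (14%); 2: 18 (20%); 3: 30 (33%); 4: 10 (11%) | 0.01 | 0
5 | 6 | 7 | 9 | 572.62 | 355.02 | 0: 27 (10%); 1: 83 (31%); 2: 40 (15%); 3: 55 (20%); 4: 17 (6%); 5: 48 (18%) | 0: 15 (16%); 1: 17 (18%); 2: 14 (15%); 3: 20 (22%); 4: 12 (13%); 5: 14 (15%) | 0.02 | 0.01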
A = Cluster Model (Full Training Set)
B = Cluster Model (66% Split)
* Euclidean distance is used as the distance (or similarity) function.
The above table presents the measurements produced when running K means on the full training set and the 66% training set of the heart disease dataset.
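The percentages in the table above can be reproduced from the instance counts. The sketch below is a small consistency check rather than part of the original Weka workflow; the helper name `cluster_shares` is hypothetical. It also shows that the counts for the 66% split sum to 92 instances, which would correspond to the 34% of the 270 patients held out from training, assuming the split percentages are reported over the held-out instances.

```python
from typing import Dict, List


def cluster_shares(counts: List[int]) -> List[int]:
    """Return each cluster's share of instances as a whole-number percentage."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


# Instance counts copied from the table (A = full training set, B = 66% split).
counts_full: Dict[int, List[int]] = {
    2: [115, 155],
    3: [49, 130, 91],
    4: [57, 109, 28, 76],
    5: [45, 85, 31, 51, 58],
    6: [27, 83, 40, 55, 17, 48],
}
counts_split: Dict[int, List[int]] = {
    2: [42, 50],
    3: [23, 46, 23],
    4: [22, 12, 24, 34],
    5: [21, 13, 18, 30, 10],
    6: [15, 17, 14, 20, 12, 14],
}

for k in counts_full:
    print(k, cluster_shares(counts_full[k]), cluster_shares(counts_split[k]))

# Every full-set row covers all 270 patients; every split row covers 92.
print({sum(v) for v in counts_full.values()}, {sum(v) for v in counts_split.values()})
```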
The table below presents the cluster centroids of the K means models for the full and 66% training sets in the Weka 3.8.3 tool.
Table 3: Centroid clusters of K Means Clusters for Full / 66% Training set
Cluster Centroids / Clustering model (full training set)
S.No  Number of Clusters  Initial starting points (random)
Hypertrophy',160,No,3.6,Downsloping,'Non AnginalPain',Normal,'No Disease'
Cluster 1: 42,Male,Asymptomatic,140,226,FALSE,Normal,178,No,0,Upsloping,'Typical Angina',Normal,Disease
2 3 Cluster 0: 62,Female,Asymptomatic,140,268,FALSE,'Left Ventricular Hypertrophy',160,No,3.6,Downsloping,'Non AnginalPain',Normal,'No Disease'
Cluster 1: 42,Male,Asymptomatic,140,226,FALSE,Normal,178,No,0,Upsloping,'Typical Angina',Normal,Disease
Cluster 2: 60,Male,Asymptomatic,117,230,TRUE,Normal,160,Yes,1.4,Upsloping,'Non AnginalPain','Reversible defect ','No Disease'
3 4 Cluster 0: 62,Female,Asymptomatic,140,268,FALSE,'Left Ventricular Hypertrophy',160,No,3.6,Downsloping,'Non AnginalPain',Normal,'No Disease'
Cluster 1: 42,Male,Asymptomatic,140,226,FALSE,Normal,178,No,0,Upsloping,'Typical Angina',Normal,Disease
Cluster 2: 60,Male,Asymptomatic,117,230,TRUE,Normal,160,Yes,1.4,Upsloping,'Non AnginalPain','Reversible defect ','No Disease'
Cluster 3: 64,Male,Asymptomatic,128,263,FALSE,Normal,105,Yes,0.2,Flat,'Atypical Angina','Reversible defect ',Disease
4 5 Cluster 0: 62,Female,Asymptomatic,140,268,FALSE,'Left Ventricular Hypertrophy',160,No,3.6,Downsloping,'Non AnginalPain',Normal,'No Disease'
Cluster 1: 42,Male,Asymptomatic,140,226,FALSE,Normal,178,No,0,Upsloping,'Typical Angina',Normal,Disease
Cluster 2: 60,Male,Asymptomatic,117,230,TRUE,Normal,160,Yes,1.4,Upsloping,'Non AnginalPain','Reversible defect ','No Disease'
Cluster 3: 64,Male,Asymptomatic,128,263,FALSE,Normal,105,Yes,0.2,Flat,'Atypical Angina','Reversible defect ',Disease
Cluster 4: 57,Female,Asymptomatic,128,303,FALSE,'Left Ventricular Hypertrophy',159,No,0,Upsloping,'Atypical Angina',Normal,Disease
5 6 Cluster 0: 62,Female,Asymptomatic,140,268,FALSE,'Left Ventricular Hypertrophy',160,No,3.6,Downsloping,'Non AnginalPain',Normal,'No Disease'
Cluster 1: 42,Male,Asymptomatic,140,226,FALSE,Normal,178,No,0,Upsloping,'Typical Angina',Normal,Disease
Cluster 2: 60,Male,Asymptomatic,117,230,TRUE,Normal,160,Yes,1.4,Upsloping,'Non AnginalPain','Reversible defect ','No Disease'
Cluster 5: 50,Female,Asymptomatic,110,254,FALSE,'Left Ventricular Hypertrophy',159,No,0,Upsloping,'Typical Angina',Normal,Disease
Cluster Centroids / Clustering model (66% Split)
1 2 Cluster 0: 48,Male,'Non Anginal Pain',124,255,TRUE,Normal,175,No,0,Upsloping,'Non Anginal Pain',Normal,Disease
Cluster 1: 38,Male,'Typical Angina',120,231,FALSE,Normal,182,Yes,3.8,Flat,'Typical Angina','Reversible defect ','No Disease'
2 3 Cluster 0: 48,Male,'Non Anginal Pain',124,255,TRUE,Normal,175,No,0,Upsloping,'Non Anginal Pain',Normal,Disease
Cluster 1: 38,Male,'Typical Angina',120,231,FALSE,Normal,182,Yes,3.8,Flat,'Typical Angina','Reversible defect ','No Disease'
Cluster 2: 44,Male,'Atypical Angina',120,263,FALSE,Normal,173,No,0,Upsloping,'Typical Angina','Reversible defect ',Disease
3 4 Cluster 0: 48,Male,'Non Anginal Pain',124,255,TRUE,Normal,175,No,0,Upsloping,'Non Anginal Pain',Normal,Disease
Cluster 1: 38,Male,'Typical Angina',120,231,FALSE,Normal,182,Yes,3.8,Flat,'Typical Angina','Reversible defect ','No Disease'
Cluster 2: 44,Male,'Atypical Angina',120,263,FALSE,Normal,173,No,0,Upsloping,'Typical Angina','Reversible defect ',Disease
Cluster 3: 61,Male,Asymptomatic,120,260,FALSE,Normal,140,Yes,3.6,Flat,'Atypical Angina','Reversible defect ','No Disease'
4 5 Cluster 0: 48,Male,'Non Anginal Pain',124,255,TRUE,Normal,175,No,0,Upsloping,'Non Anginal Pain',Normal,Disease
Cluster 1: 38,Male,'Typical Angina',120,231,FALSE,Normal,182,Yes,3.8,Flat,'Typical Angina','Reversible defect ','No Disease'
Cluster 2: 44,Male,'Atypical Angina',120,263,FALSE,Normal,173,No,0,Upsloping,'Typical Angina','Reversible defect ',Disease
Cluster 3: 61,Male,Asymptomatic,120,260,FALSE,Normal,140,Yes,3.6,Flat,'Atypical Angina','Reversible defect ','No Disease'
Cluster 4: 58,Male,Asymptomatic,150,270,FALSE,'Left Ventricular Hypertrophy',111,Yes,0.8,Upsloping,'Typical Angina','Reversible defect ','No Disease'
5 6 Cluster 0: 48,Male,'Non Anginal Pain',124,255,TRUE,Normal,175,No,0,Upsloping,'Non
Cluster 2: 44,Male,'Atypical Angina',120,263,FALSE,Normal,173,No,0,Upsloping,'Typical Angina','Reversible defect ',Disease
Cluster 3: 61,Male,Asymptomatic,120,260,FALSE,Normal,140,Yes,3.6,Flat,'Atypical Angina','Reversible defect ','No Disease'
Cluster 4: 58,Male,Asymptomatic,150,270,FALSE,'Left Ventricular Hypertrophy',111,Yes,0.8,Upsloping,'Typical Angina','Reversible defect ','No Disease'
Cluster 5: 67,Male,Asymptomatic,120,237,FALSE,Normal,71,No,1,Flat,'Typical Angina',Normal,'No Disease'
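In the listings above, the starting points for a smaller number of clusters reappear as the first starting points for a larger number; for example, Cluster 0 and Cluster 1 are identical in every model of the same training set. This is what one would expect if the starting points are drawn one after another from a random generator with a fixed seed. The sketch below illustrates that behaviour in Python. It does not reproduce Weka's own random number stream, and the function name `pick_initial_indices` and the seed value 10 are assumptions for illustration.

```python
import random
from typing import List


def pick_initial_indices(n_instances: int, k: int, seed: int = 10) -> List[int]:
    """Draw k distinct instance indices one at a time from a seeded generator."""
    rng = random.Random(seed)  # Same seed gives the same sequence of draws.
    chosen: List[int] = []
    while len(chosen) < k:
        idx = rng.randrange(n_instances)
        if idx not in chosen:  # Skip duplicates so starting points are distinct.
            chosen.append(idx)
    return chosen


# With a fixed seed, the picks for k clusters are a prefix of those for k + 1.
picks = {k: pick_initial_indices(270, k) for k in range(2, 7)}
for k in range(2, 6):
    assert picks[k + 1][:k] == picks[k]
print(picks[6])
```

A consequence is that models with different numbers of clusters are not initialised independently, so changing the seed may change all of them together.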
Figure 21: K Means cluster Vs Iterations
The above diagram shows the number of iterations needed for each model. With 2 clusters, the model takes 3 iterations for the full training set and 4 iterations for the 66% training set. With 3 clusters, it takes 4 iterations for both the full training set and the 66% training set. With 4 clusters, it takes 5 iterations for the full training set and 8 iterations for the 66% training set. With 5 clusters, it takes 6 iterations for the full training set and 5 iterations for the 66% training set. With 6 clusters, it takes 7 iterations for the full training set and 9 iterations for the 66% training set.
[Figure 21 chart: K Means Clusters vs Iterations. X-axis: Number of Clusters (2 to 6); Y-axis: Number of Iterations; series: Full Training Set, 66% Training Set.]
Figure 22: K Means cluster Vs SSE
The above diagram shows the sum of squared errors (SSE) for each model. With 2 clusters, the SSE is 710.46 for the full training set and 466.36 for the 66% training set. With 3 clusters, it is 648.26 for the full training set and 426.63 for the 66% training set. With 4 clusters, it is 608.71 for the full training set and 398.56 for the 66% training set. With 5 clusters, it is 581.95 for the full training set and 379.77 for the 66% training set. With 6 clusters, it is 572.62 for the full training set and 355.02 for the 66% training set.
[Figure 22 chart: K Means Clusters vs Sum of Squared Errors. X-axis: Number of Clusters (2 to 6); Y-axis: Sum of Squared Errors; series: Full Training Set, 66% Training Set.]
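Since the SSE of K means generally falls as clusters are added, it can be useful to look at how much each additional cluster reduces it, not only at the raw value. The following sketch uses only the SSE values reported above and computes the successive reductions. The function name is illustrative, and reading the result as an "elbow" style heuristic is an assumption, not a procedure used in this work.

```python
# SSE values reported for k = 2..6 (full training set and 66% training set).
ks = [2, 3, 4, 5, 6]
sse_full = [710.46, 648.26, 608.71, 581.95, 572.62]
sse_split = [466.36, 426.63, 398.56, 379.77, 355.02]


def sse_reductions(sse):
    """Drop in SSE gained by each additional cluster, rounded to 2 decimals."""
    return [round(prev - cur, 2) for prev, cur in zip(sse, sse[1:])]


drops_full = sse_reductions(sse_full)
drops_split = sse_reductions(sse_split)
for k, d_full, d_split in zip(ks[1:], drops_full, drops_split):
    print(f"k={k}: full set -{d_full}, 66% set -{d_split}")
```

The two series are sums over different numbers of instances, so they are best compared within a series and not against each other.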
[Chart: K Means Clusters vs Time Taken to Build the Model (in seconds). X-axis: number of clusters (2 to 6); Y-axis: Time (in seconds); series: Full Training Set, 66% Training Set.]
The above diagram shows the time taken to build each model. With 2 clusters, building the model takes 0.01 seconds for the full training set and zero seconds for the 66% training set. With 3 clusters, it takes 0.01 seconds for the full training set and zero seconds for the 66% training set. With 4 clusters, it takes 0.01 seconds for the full training set and 0.03 seconds for the 66% training set. With 5 clusters, it takes 0.01 seconds for the full training set and zero seconds for the 66% training set. With 6 clusters, it takes 0.02 seconds for the full training set and 0.01 seconds for the 66% training set.
[Chart: K Means Clusters vs Cluster Instances. X-axis: number of clusters (2 to 6); Y-axis: Cluster Instances (percentage of instances per cluster); series: Full Training Set, 66% Training Set.]
The above diagram shows the share of instances assigned to each cluster. With 2 clusters, the model assigns 43% and 57% of the instances for the full training set and 46% and 54% for the 66% training set. With 3 clusters, it assigns 18%, 48%, and 34% for the full training set and 25%, 50%, and 25% for the 66% training set. With 4 clusters, it assigns 21%, 40%, 10%, and 28% for the full training set and 24%, 13%, 26%, and 37% for the 66% training set. With 5 clusters, it assigns 17%, 31%, 11%, 19%, and 21% for the full training set and 23%, 14%, 20%, 33%, and 11% for the 66% training set. With 6 clusters, it assigns 10%, 31%, 15%, 20%, 6%, and 18% for the full training set and 16%, 18%, 15%, 22%, 13%, and 15% for the 66% training set. In terms of time consumption, the models with 2, 3, and 5 clusters take zero seconds to build on the 66% training set, compared with 0.01 seconds for 6 clusters and 0.03 seconds for 4 clusters. The models with 5 and 6 clusters have lower sums of squared errors than the other models for both the full training set and the 66% training set.
Conclusion
Finally, this work concludes that the proposed model with 6 clusters needs the largest number of iterations to build: 7 iterations for the full training set and 9 iterations for the 66% training set, with a sum of squared errors of 572.62 for the full training set and 355.02 for the 66% training set. It takes 0.02 seconds to build the model for the full training set and 0.01 seconds for the 66% training set. This model produces the lowest sum of squared errors compared with the other models.
References
[1] G. Ayyappan, K. Sivakumar, Heart Disease Data Set Classifications: Comparisons Of Correlation Co Efficient By Applying Various Parameters In Gaussian Processes, Indian Journal of Computer Science and Engineering (IJCSE), Vol. 9 No. 5 Oct-Nov 2018, Page Number 130-134, e-ISSN: 0976-5166, p-ISSN: 2231-3850.
[2] Detrano, R., Janosi, A., Steinbrunn, W., Pfisterer, M., Schmid, J., Sandhu, S., Guppy, K., Lee, S., & Froelicher, V. (1989). International application of a new probability algorithm for the diagnosis of coronary artery disease. American Journal of Cardiology, 64, 304-310.
[3] David W. Aha & Dennis Kibler. "Instance-based prediction of heart-disease presence with the Cleveland database."
[4] S. Mohan, C. Thirumalai, G. Srivastava, 2019. Effective Heart Disease Prediction Using Hybrid Machine Learning Techniques. IEEE Access, vol. 7, pp. 81542-81554, 2019, doi: 10.1109/ACCESS.2019.2923707.
[5] Chandna, Deepali, 2014. Diagnosis of Heart Disease Using Data Mining Algorithm.
[7] Karthiga, A. Sankari, M. Safish Mary, M. Yogasins, 2017. Early Prediction of Heart Disease Using Decision Tree Algorithm. International Journal of Advanced Research in Basic Engineering Sciences and Technology 3.3 (2017).
[8] C. Sowmiya, P. Sumitra, 2017. Analytical study of heart disease diagnosis using classification techniques. IEEE International Conference on Intelligent Techniques in Control, Optimization and Signal Processing (INCOS), Srivilliputhur, 2017, pp. 1-5, doi: 10.1109/ITCOSP.2017.8303115.
[9] Bahadur, Shamsher, 2013. Predict the Diagnosis of Heart Disease Patients Using Classification Mining Techniques. IOSR Journal of Agriculture and Veterinary Science. 4. 60-64. 10.9790/2380-0426164.
[10] G. Ayyappan, K. Sivakumar, Heart Disease Data Set Classifications: Comparisons Of Correlation Co Efficient By Applying Various Parameters In Gaussian Processes, Indian Journal of Computer Science and Engineering (IJCSE), Vol. 9 No. 5 Oct-Nov 2018, Page Number 135-140, e-ISSN: 0976-5166, p-ISSN: 2231-3850.
[11] Gennari, J.H., Langley, P, & Fisher, D. (1989). Models of incremental concept formation. Artificial Intelligence, 40, 11--61.
[12] https://www.kaggle.com/mruanova/predict-heart-disease-using-random-forests#Random-Forest-Classifier
[13] https://www.kaggle.com/nyjoey/heart-disease
[14] https://towardsdatascience.com/exploratory-data-analysis-on-heart-disease-uci-data-set-ae129e47b323
[15] C. Sowmiya, P. Sumitra, 2017. Analytical study of heart disease diagnosis using classification techniques. IEEE International Conference on Intelligent Techniques in Control, Optimization and Signal Processing (INCOS), Srivilliputhur, 2017, pp. 1-5, doi: 10.1109/ITCOSP.2017.8303115.
[16] Parthiban, G., Srivatsa, Shesh, 2012. Applying Machine Learning Methods in Diagnosing Heart Disease for Diabetic Patients. International Journal of Applied Information Systems. 3. 25-30. 10.5120/ijais12-450593.
[17] Cömert, Z., A. F. Kocamaz, 2017. Comparison of machine learning techniques for fetal heart rate classification. Acta Phys. Pol. A 132.3 (2017): 451-454.
[18] Patel, Jaymin, Upadhyay, Tejal, Patel, Samir, 2016. Heart Disease Prediction using Machine learning and Data Mining Technique. International Journal of Computing Science and Communication. 10.090592/IJCSC.2016.018.
[19] S. Pouriyeh, S. Vahid, G. Sannino, G. De Pietro, H. Arabnia, J. Gutierrez, 2017. A comprehensive investigation and comparison of Machine Learning Techniques in the domain of heart disease. IEEE Symposium on Computers and Communications (ISCC), Heraklion, 2017, pp. 204-207, doi: