Heart Disease Detection Using Machine Learning Models
Date
2023-08-31
Authors
González Cartagena, Rafael
Ocasio Adorno, Yadriel A.
Abstract
Heart disease, the set of health complications that negatively affect the heart, is currently one of the leading causes of death worldwide. In the United States (US), for instance, it has the highest mortality rate for both men and women, amounting to 545,000 deaths in 2021 alone. For this reason, and given current advances in computing technology, this research project studies the accuracy of three machine learning algorithms, namely K-Nearest Neighbor, Gradient Boost and LightGBM, in detecting heart disease from previously compiled datasets. Of the three (3) models, LightGBM presented the best results, with an accuracy score of 98.5% across the two (2) datasets, followed by Gradient Boost (95%) and K-Nearest Neighbor (90.5%). The datasets used for this project are Rashik Rahman’s Heart Attack Analysis and Prediction Dataset and David Lapp’s Heart Disease Dataset - Public Health Dataset, both acquired from the Kaggle website. The methods and technologies used include the Python 3 programming language with its scikit-learn library, Google’s Colaboratory service and various Machine Learning topics, such as Feature Scaling, Data Imputation and Data Encoding, to name a few.
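To make the preprocessing and classification steps named in the abstract more concrete, the following is a minimal sketch of mean imputation, feature scaling and K-Nearest Neighbor classification. It is not the authors' code: the project used scikit-learn, whereas this sketch uses only the Python standard library to show the mechanics. The feature columns (age, resting blood pressure, cholesterol), the toy values, the choice of k = 3 and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical toy records: [age, resting blood pressure, cholesterol].
# None marks a missing value. These numbers are invented, not taken from the datasets.
raw_X = [
    [45, 120, 200], [50, 130, None], [39, 118, 190], [42, 125, 210],
    [63, 150, 290], [67, 160, 300], [58, 145, None], [70, 155, 310],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = heart disease, 0 = no heart disease (assumed coding)


def impute_mean(rows):
    """Data imputation: fill each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def fit_scaler(rows):
    """Feature scaling: compute the per-column mean and population standard deviation."""
    n, n_cols = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0  # guard against zero spread
        for j in range(n_cols)
    ]
    return means, stds


def scale(row, means, stds):
    """Standardize one row so every feature contributes comparably to the distance."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to the query."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


X = impute_mean(raw_X)
means, stds = fit_scaler(X)
X_scaled = [scale(r, means, stds) for r in X]

pred_high = knn_predict(X_scaled, y, scale([65, 152, 295], means, stds))
pred_low = knn_predict(X_scaled, y, scale([41, 119, 195], means, stds))
print(pred_high, pred_low)
```

Scaling matters for K-Nearest Neighbor because the classifier relies on distances: without it, a feature with a large numeric range such as cholesterol would dominate a feature with a small range such as age. Tree-based models like Gradient Boost and LightGBM are generally less sensitive to feature scale.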