Heart Disease Detection Using Machine Learning Models

González Cartagena, Rafael; Ocasio Adorno, Yadriel A.

View/

PUPR_SJU_CEAH_URP-HS_2022-2023_ Rafael Gonzalez and Yadriel Ocasio_Poster (801.2Kb)

Date

2023-08-31

Author

González Cartagena, Rafael

Ocasio Adorno, Yadriel A.

Abstract

Heart disease, the set of health complications that negatively affect the heart, is currently one of the leading causes of death worldwide. In the United States (US), for instance, it has the highest mortality rate among both men and women, amounting to 545,000 deaths in 2021 alone. For this reason, and given current advances in computing technology, this research project studies the accuracy of three machine learning algorithms (K-Nearest Neighbor, Gradient Boost and LightGBM) in detecting heart disease from already compiled datasets. Of the three models, LightGBM presented the best results, with an accuracy score of 98.5% across the two datasets, followed by Gradient Boost (95%) and K-Nearest Neighbor (90.5%). The datasets used for this project are Rashik Rahman’s Heart Attack Analysis and Prediction Dataset and David Lapp’s Heart Disease Dataset - Public Health Dataset, both acquired from the Kaggle website. The methods and technologies used include the Python 3 programming language with the scikit-learn library, Google’s Colaboratory service, and several machine learning techniques, such as feature scaling, data imputation and data encoding.
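
As an illustration of the kind of workflow the abstract describes, the following minimal sketch chains mean imputation, feature scaling, and a K-Nearest Neighbor classifier, then reports an accuracy score. It is not the authors' code: it uses only the Python 3 standard library rather than scikit-learn, the helper names are hypothetical, and the rows (age, cholesterol) are invented toy values, not records from either Kaggle dataset.

```python
import math
from collections import Counter


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def fit_scaler(rows):
    """Compute the per-column mean and standard deviation of the training rows."""
    n, n_cols = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant columns
    return means, stds


def scale(rows, means, stds):
    """Standardize rows using statistics learned from the training set."""
    return [[(r[j] - means[j]) / stds[j] for j in range(len(r))] for r in rows]


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training points closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy data: [age, cholesterol]; None marks a missing value.
train_raw = [
    [35, 180], [40, None], [42, 190], [38, 175],   # label 0 (no disease)
    [60, 280], [65, 300], [58, None], [62, 290],   # label 1 (disease)
]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_raw = [[37, 185], [63, 295]]  # no missing values, so no imputation needed
test_y = [0, 1]

train_X = impute_mean(train_raw)
means, stds = fit_scaler(train_X)
train_X = scale(train_X, means, stds)
test_X = scale(test_raw, means, stds)

predictions = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
print("accuracy:", accuracy(test_y, predictions))
```

Scaling matters here because K-Nearest Neighbor relies on distances: without it, the cholesterol column, whose values span a wider range, would dominate the age column. In practice the same steps would more likely be assembled from scikit-learn components, and the accuracy on toy data says nothing about the figures reported in the abstract.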

URI

http://hdl.handle.net/20.500.12475/1981

Collections

Undergraduate Research Program for Honor and Outstanding Students