Heart Disease Dataset Clusterization

Polina Dudchenko, Aleksei Dudchenko, Georgy Kopanitsa

Research output: Contribution to journal (Article)

Abstract

Clusterization is a promising group of methods in the context of patient similarity. However, clustering results are often not clear to physicians, and different clustering methods can produce different results. We examined a well-known dataset and implemented three clustering methods (k-means, agglomerative and spectral). We compared and evaluated the clusters and their correlation with the data attributes. In contrast to the original dataset's target value, the clusters correlated with only a few attributes. Finally, we trained two predictive models, one based on the k-nearest neighbors (KNN) algorithm and one on an artificial neural network (ANN). Model evaluation demonstrates that using the results of the clustering algorithms as the predictive attribute gives a higher F-score than using the original target attribute.
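The workflow the abstract describes (cluster the records, then train a classifier to predict the cluster label and score it with an F-score) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy 2-D points, the function names, the farthest-first initialisation, the leave-one-out evaluation and the choice of 3 neighbours are hypothetical and are not taken from the paper. Only k-means and KNN are shown; the agglomerative and spectral methods and the ANN are omitted.

```python
import math
from collections import Counter

# Hypothetical toy data: two well-separated groups of 2-D points.
# These are NOT values from the heart disease dataset.
POINTS = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.3),
]


def kmeans(points, k, max_iters=100):
    """Lloyd's k-means with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels


def knn_leave_one_out(points, labels, n_neighbors=3):
    """Predict each point's label by majority vote of its nearest other points."""
    predictions = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(p, points[j]))
        votes = Counter(labels[j] for j in nearest[:n_neighbors])
        predictions.append(votes.most_common(1)[0][0])
    return predictions


def f_score(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# The cluster labels play the role of the predictive (target) attribute.
cluster_labels = kmeans(POINTS, k=2)
predicted = knn_leave_one_out(POINTS, cluster_labels)
score = f_score(cluster_labels, predicted)
print(cluster_labels, predicted, score)
```

On data this cleanly separated, the classifier recovers the cluster labels trivially, so the sketch shows only the mechanics; it says nothing about the F-scores reported in the article.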

Original language: English
Pages (from-to): 162-167
Number of pages: 6
Journal: Studies in Health Technology and Informatics
Volume: 261
Publication status: Published - 1 Jan 2019
Externally published: Yes

Fingerprint

Cluster Analysis
Heart Diseases
Clustering algorithms
Neural networks
Physicians
Datasets

Keywords

  • clusterization
  • patient classification
  • Patient similarity

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management

Cite this

Heart Disease Dataset Clusterization. / Dudchenko, Polina; Dudchenko, Aleksei; Kopanitsa, Georgy.

In: Studies in Health Technology and Informatics, Vol. 261, 01.01.2019, p. 162-167.

@article{d4f56d2cbd9a4f54931dd77875c6c758,
title = "Heart Disease Dataset Clusterization",
keywords = "clusterization, patient classification, Patient similarity",
author = "Polina Dudchenko and Aleksei Dudchenko and Georgy Kopanitsa",
year = "2019",
month = "1",
day = "1",
language = "English",
volume = "261",
pages = "162--167",
journal = "Studies in Health Technology and Informatics",
issn = "0926-9630",
publisher = "IOS Press",

}

TY - JOUR

T1 - Heart Disease Dataset Clusterization

AU - Dudchenko, Polina

AU - Dudchenko, Aleksei

AU - Kopanitsa, Georgy

PY - 2019/1/1

Y1 - 2019/1/1

KW - clusterization

KW - patient classification

KW - Patient similarity

UR - http://www.scopus.com/inward/record.url?scp=85067121838&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067121838&partnerID=8YFLogxK

M3 - Article

VL - 261

SP - 162

EP - 167

JO - Studies in Health Technology and Informatics

JF - Studies in Health Technology and Informatics

SN - 0926-9630

ER -