Comparison of Decision Tree, Neural Network, Statistic Learning, and k-NN Algorithms in Data Mining of Thyroid Disease Datasets

Wafaa Al Somali, Riyad Al Shammari

2018

Abstract

The massive amount of information contained in medical datasets presents a challenge to practitioners in diagnosing diseases or determining the health status of patients. Data mining is therefore required to help users obtain valuable information from a very complex data collection. In this study, we explored several data mining methods in order to improve the quality of a dataset related to the diagnosis of thyroid disease. Several classifiers were trained on the dataset and compared with the previous study by Akbaş et al. (2013). The performance improvement was examined in order to determine the best classifier. Findings revealed that the decision tree (J48) algorithm outperformed all other algorithms in terms of accuracy, Kappa, Matthews correlation coefficient (MCC), and receiver operating characteristic (ROC), with respective values of 0.994, 0.951, 0.953, and 0.987. Classification using J48 was found to be better than that reported by Akbaş et al. In contrast, the IBK algorithm showed the poorest performance, particularly in Kappa and MCC. The sizes of the trees generated by J48 and Logistic Model Tree (LMT) varied greatly. Integrating a single classifier with the AdaBoost classifier mostly resulted in higher accuracy. However, AdaBoost did not improve the performance of the NaïveBayes, IBK, and RandomForest algorithms. These results were consistent with the previous study using an AdaBoost-based ensemble classifier.
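For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, Cohen's Kappa, and MCC can be computed from a two-class confusion matrix. It is an illustration only, not the authors' evaluation procedure: the function name `binary_metrics` and the example counts are hypothetical, and the binary setting is an assumption made for brevity.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, kappa, mcc) for a 2x2 confusion matrix.

    tp, fp, fn, tn are raw counts; the names are illustrative.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    # Agreement expected by chance, from the row and column marginals.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_pos + p_neg
    # Kappa rescales accuracy so that chance-level agreement maps to 0.
    kappa = (accuracy - p_e) / (1 - p_e) if p_e < 1 else 0.0

    # MCC is the correlation between predicted and actual labels.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return accuracy, kappa, mcc


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    acc, kappa, mcc = binary_metrics(tp=40, fp=10, fn=10, tn=40)
    print(f"accuracy={acc:.3f} kappa={kappa:.3f} mcc={mcc:.3f}")
```

Because Kappa and MCC both correct for class imbalance, they can differ sharply from accuracy on skewed data, which is one reason a classifier may look acceptable on accuracy yet rank poorly on Kappa and MCC.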


Paper Citation


in Harvard Style

Al Somali W. and Al Shammari R. (2018). Comparison of Decision Tree, Neural Network, Statistic Learning, and k-NN Algorithms in Data Mining of Thyroid Disease Datasets. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 5: HEALTHINF; ISBN 978-989-758-281-3, SciTePress, pages 241-246. DOI: 10.5220/0006479602410246


in Bibtex Style

@conference{healthinf18,
author={Wafaa Al Somali and Riyad Al Shammari},
title={Comparison of Decision Tree, Neural Network, Statistic Learning, and k-NN Algorithms in Data Mining of Thyroid Disease Datasets},
booktitle={Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 5: HEALTHINF},
year={2018},
pages={241-246},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006479602410246},
isbn={978-989-758-281-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 5: HEALTHINF
TI - Comparison of Decision Tree, Neural Network, Statistic Learning, and k-NN Algorithms in Data Mining of Thyroid Disease Datasets
SN - 978-989-758-281-3
AU - Al Somali W.
AU - Al Shammari R.
PY - 2018
SP - 241
EP - 246
DO - 10.5220/0006479602410246
PB - SciTePress