Authors: Diogo Machado 1,2; Vítor Santos Costa 2,3 and Pedro Brandão 1,2

Affiliations: 1 Instituto de Telecomunicações, Portugal; 2 Faculty of Science of the University of Porto, Portugal; 3 INESC-TEC, Portugal

Keyword(s): Data Mining, Diabetes, Data-Balance, Over-Sampling, Under-Sampling.

Abstract: Imbalanced data sets pose a complex problem in data mining. Health-related data sets, where the positive class is connected to the existence of an anomaly, are prone to be imbalanced. Data related to diabetes management follows this trend. In the case of diabetes, patients avoid situations of hypo/hyperglycaemia, which is the anomaly we want to detect. Balancing methods can provide more examples of the minority class and assist the classifier by clarifying the decision boundary. Nevertheless, each over-sampling and under-sampling method affects the data set in its own way, which in turn influences the classifier's performance. In this work, the authors studied the impact of the best-known data-balancing methods applied to the Ohio and St. Louis diabetes-related data sets. The best and most robust approach was the use of ENN (Edited Nearest Neighbours) with SMOTE (Synthetic Minority Over-sampling Technique). This hybrid method produced significant performance gains on all the performed tests. ENN in particular had a meaningful impact on all the tests. Given the limited volume of glycaemia-based data available for diabetes management, over-sampling methods would be expected to play a greater role in improving the classifier's performance. In our experiments, however, the removal of noisy values by the under-sampling methods produced better results.
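
To make the hybrid approach named in the abstract more concrete, the following is a minimal sketch of SMOTE-style over-sampling followed by ENN-style cleaning, written with the Python standard library only. It is not the authors' implementation: the function names (`k_nearest`, `smote`, `enn`), the choice of k = 3, the order of the two steps, the decision to let ENN clean both classes, and the toy two-dimensional data are all assumptions made for illustration. The data are not taken from the Ohio or St. Louis data sets.

```python
import math
import random
from collections import Counter


def k_nearest(point, candidates, k):
    """Indices of the (at most) k candidates closest to point (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def smote(X, y, minority, n_new, k=3, seed=0):
    """Over-sample: add n_new points interpolated between minority-class neighbours."""
    rng = random.Random(seed)
    pool = [x for x, label in zip(X, y) if label == minority]  # needs >= 2 samples
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(pool)
        others = [p for p in pool if p is not base]
        neighbour = others[rng.choice(k_nearest(base, others, k))]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return X + synthetic, y + [minority] * n_new


def enn(X, y, k=3):
    """Under-sample: drop samples whose label differs from the vote of their k nearest neighbours."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        votes = Counter(rest_y[j] for j in k_nearest(x, rest_X, k))
        if votes.most_common(1)[0][0] == label:
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y


# Toy data (hypothetical): class 0 is the majority, class 1 the minority.
# The point [10.3, 10.6] is a majority-labelled "noise" sample inside the minority region.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8], [10.3, 10.6],
     [10, 10], [10, 11], [11, 10], [11, 11]]
y = [0] * 7 + [1] * 4

X_over, y_over = smote(X, y, minority=1, n_new=3)  # balance the classes
X_clean, y_clean = enn(X_over, y_over)             # then remove noisy samples
```

In this toy setting the over-sampling step brings the minority class up to the size of the majority class, and the cleaning step then discards the majority-labelled sample that sits among minority points, which is the kind of noise removal the abstract credits for the performance gains.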

CC BY-NC-ND 4.0

Paper citation in several formats:
Machado, D., Santos Costa, V. and Brandão, P. (2023). Using Balancing Methods to Improve Glycaemia-Based Data Mining. In Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - HEALTHINF; ISBN 978-989-758-631-6; ISSN 2184-4305, SciTePress, pages 188-198. DOI: 10.5220/0011797100003414

@conference{healthinf23,
author={Diogo Machado and Vítor {Santos Costa} and Pedro Brandão},
title={Using Balancing Methods to Improve Glycaemia-Based Data Mining},
booktitle={Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - HEALTHINF},
year={2023},
pages={188-198},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011797100003414},
isbn={978-989-758-631-6},
issn={2184-4305},
}

TY - CONF

JO - Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - HEALTHINF
TI - Using Balancing Methods to Improve Glycaemia-Based Data Mining
SN - 978-989-758-631-6
IS - 2184-4305
AU - Machado, D.
AU - Santos Costa, V.
AU - Brandão, P.
PY - 2023
SP - 188
EP - 198
DO - 10.5220/0011797100003414
PB - SciTePress