Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features
Michael Behringer, Pascal Hirmer, Dennis Tschechlov, Bernhard Mitschang
2022
Abstract
Today, the amount of data is growing rapidly, which makes it nearly impossible for human analysts to comprehend the data or to extract any knowledge from it. To cope with this, as part of the knowledge discovery process, many different data mining and machine learning techniques were developed in the past. A famous representative of such techniques is clustering, which allows the identification of different groups of data (the clusters) based on data characteristics. These algorithms need no prior knowledge or configuration, which makes them easy to use, but interpreting and explaining the results can become very difficult for domain experts. Even though different kinds of visualizations for clustering results exist, they do not offer enough details for explaining how the algorithms reached their results. In this paper, we propose a new approach to increase explainability for clustering algorithms. Our approach identifies and selects features that are most meaningful for the clustering result. We conducted a comprehensive evaluation in which, based on 216 synthetic datasets, we first examined various dispersion metrics regarding their suitability to identify meaningful features and then evaluated the achieved precision with respect to different data characteristics. This evaluation shows that our approach outperforms existing algorithms in 93 percent of the examined datasets.
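The abstract does not say which dispersion metric or selection rule the approach uses, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes variance as the dispersion metric and scores each feature by the ratio of its size-weighted within-cluster variance to its overall variance; a low ratio suggests the feature separates the clusters well. The function name `rank_features_by_dispersion` and the toy data are hypothetical.

```python
from statistics import pvariance


def rank_features_by_dispersion(rows, labels, feature_names):
    """Rank features by within-cluster variance relative to overall variance.

    Assumption for illustration only: a feature whose values are tightly
    grouped inside each cluster, compared with the whole dataset, is treated
    as meaningful for the clustering result. Lower scores rank first.
    """
    clusters = sorted(set(labels))
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in rows]
        total_var = pvariance(column)
        if total_var == 0:
            # A constant feature cannot distinguish clusters.
            scores[name] = 1.0
            continue
        # Size-weighted sum of the per-cluster population variances.
        within = 0.0
        for c in clusters:
            values = [v for v, lab in zip(column, labels) if lab == c]
            within += len(values) * pvariance(values)
        scores[name] = (within / len(column)) / total_var
    return sorted(scores.items(), key=lambda item: item[1])


# Toy data: feature "x" separates the two clusters, "noise" does not.
rows = [(1.0, 5.0), (1.2, 9.0), (0.9, 1.0), (8.0, 5.5), (8.3, 1.5), (7.9, 9.5)]
labels = [0, 0, 0, 1, 1, 1]
ranking = rank_features_by_dispersion(rows, labels, ["x", "noise"])
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy example, "x" receives a score close to 0 and "noise" a score close to 1, so a domain expert would be pointed to "x" as the feature that best explains the grouping. Other dispersion metrics could be substituted for `pvariance` without changing the structure of the sketch.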
Paper Citation
in Harvard Style
Behringer M., Hirmer P., Tschechlov D. and Mitschang B. (2022). Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-569-2, pages 364-373. DOI: 10.5220/0011092000003179
in Bibtex Style
@conference{iceis22,
author={Michael Behringer and Pascal Hirmer and Dennis Tschechlov and Bernhard Mitschang},
title={Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2022},
pages={364-373},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011092000003179},
isbn={978-989-758-569-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features
SN - 978-989-758-569-2
AU - Behringer M.
AU - Hirmer P.
AU - Tschechlov D.
AU - Mitschang B.
PY - 2022
SP - 364
EP - 373
DO - 10.5220/0011092000003179