Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features

Michael Behringer; Pascal Hirmer; Dennis Tschechlov; Bernhard Mitschang

doi:10.5220/0011092000003179

2022

Abstract

Today, the amount of data is growing rapidly, which makes it nearly impossible for human analysts to comprehend the data or to extract any knowledge from it. To cope with this, many different data mining and machine learning techniques have been developed as part of the knowledge discovery process. A prominent representative of such techniques is clustering, which identifies different groups of data (the clusters) based on data characteristics. These algorithms need no prior knowledge or configuration, which makes them easy to use, but interpreting and explaining the results can become very difficult for domain experts. Even though different kinds of visualizations for clustering results exist, they do not offer enough detail to explain how the algorithms reached their results. In this paper, we propose a new approach to increase explainability for clustering algorithms. Our approach identifies and selects the features that are most meaningful for the clustering result. We conducted a comprehensive evaluation based on 216 synthetic datasets, in which we first examined various dispersion metrics regarding their suitability for identifying meaningful features and then evaluated the achieved precision with respect to different data characteristics. This evaluation shows that our approach outperforms existing algorithms in 93 percent of the examined datasets.
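
The abstract does not specify which dispersion metrics were examined or how features are selected, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one plausible metric, the ratio of within-cluster variance to overall variance per feature, and the function name `rank_features_by_dispersion` and the toy data are hypothetical. A feature whose values are much tighter inside each cluster than across the whole dataset gets a low ratio and is ranked as more meaningful.

```python
from statistics import pvariance


def rank_features_by_dispersion(rows, labels):
    """Rank feature indices from most to least meaningful.

    Assumed metric (illustrative only): size-weighted mean within-cluster
    variance divided by the overall variance of the feature. Lower is better.
    """
    n_features = len(rows[0])

    # Group the rows by their cluster label.
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)

    scores = []
    for j in range(n_features):
        overall = pvariance([row[j] for row in rows])
        if overall == 0:
            # A constant feature cannot distinguish clusters; rank it last.
            ratio = float("inf")
        else:
            within = sum(
                len(members) * pvariance([m[j] for m in members])
                for members in clusters.values()
            ) / len(rows)
            ratio = within / overall
        scores.append((ratio, j))

    return [j for _, j in sorted(scores)]


# Toy data: feature 0 separates the two clusters, feature 1 is noise.
rows = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (10.0, 2.0), (10.1, 8.0), (10.2, 4.0)]
labels = [0, 0, 0, 1, 1, 1]
ranking = rank_features_by_dispersion(rows, labels)
```

On this toy input, feature 0 has a ratio close to 0 and feature 1 a ratio close to 1, so feature 0 is ranked first. Other dispersion measures (for example range or interquartile range) could be substituted for `pvariance` without changing the structure of the sketch.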

Paper Citation

in Harvard Style

Behringer M., Hirmer P., Tschechlov D. and Mitschang B. (2022). Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-569-2, pages 364-373. DOI: 10.5220/0011092000003179

in Bibtex Style

@conference{iceis22,
author={Michael Behringer and Pascal Hirmer and Dennis Tschechlov and Bernhard Mitschang},
title={Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2022},
pages={364--373},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011092000003179},
isbn={978-989-758-569-2},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features
SN - 978-989-758-569-2
AU - Behringer M.
AU - Hirmer P.
AU - Tschechlov D.
AU - Mitschang B.
PY - 2022
SP - 364
EP - 373
DO - 10.5220/0011092000003179