SDRank: A Deep Learning Approach for Similarity Ranking of Data Sources to Support User-Centric Data Analysis

Michael Behringer, Dennis Treder-Tschechlov, Julius Voggesberger, Pascal Hirmer, Bernhard Mitschang

2023

Abstract

Today, data analytics is widely used across many domains to identify new trends, opportunities, or risks and to improve decision-making. To this end, various heterogeneous data sources must be selected to form the foundation for knowledge discovery driven by data analytics. However, discovering and selecting suitable and valuable data sources that improve the analytics results is a great challenge. Domain experts can easily become overwhelmed in the data selection process because of the large number of available data sources that might contain similar kinds of information. Supporting domain experts in discovering and selecting the most suitable data sources can save time and costs and significantly increase the quality of the analytics results. In this paper, we introduce SDRank, a novel Deep Learning approach that ranks data sources based on their similarity to already selected data sources. We implemented SDRank, trained various models on 4 860 datasets, and measured the achieved precision for evaluation purposes. In doing so, we showed that SDRank can substantially improve the workflow of domain experts in selecting beneficial data sources.


Paper Citation


in Harvard Style

Behringer M., Treder-Tschechlov D., Voggesberger J., Hirmer P. and Mitschang B. (2023). SDRank: A Deep Learning Approach for Similarity Ranking of Data Sources to Support User-Centric Data Analysis. In Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-648-4, SciTePress, pages 419-428. DOI: 10.5220/0011998300003467


in Bibtex Style

@conference{iceis23,
author={Michael Behringer and Dennis Treder-Tschechlov and Julius Voggesberger and Pascal Hirmer and Bernhard Mitschang},
title={SDRank: A Deep Learning Approach for Similarity Ranking of Data Sources to Support User-Centric Data Analysis},
booktitle={Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2023},
pages={419-428},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011998300003467},
isbn={978-989-758-648-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - SDRank: A Deep Learning Approach for Similarity Ranking of Data Sources to Support User-Centric Data Analysis
SN - 978-989-758-648-4
AU - Behringer M.
AU - Treder-Tschechlov D.
AU - Voggesberger J.
AU - Hirmer P.
AU - Mitschang B.
PY - 2023
SP - 419
EP - 428
DO - 10.5220/0011998300003467
PB - SciTePress