Asynchronous Data Provenance for Research Data in a Distributed System
Benedikt Heinrichs, Marius Politze
2021
Abstract
Many provenance systems assume that they orchestrate the data flow directly or that logs describing it are available. This works well only as long as these assumptions hold. The Coscine platform lets researchers connect to different storage providers and annotate their stored data with discipline-specific metadata. These storage providers, however, do not inform the platform of externally induced changes, for example changes made by the user. Therefore, this paper focuses on the need for data provenance that is not directly produced and has to be deduced after the fact. An approach is proposed for creating and handling such asynchronous data provenance; it makes use of change indicators to deduce whether a data entity has been modified. A representation of such asynchronous data provenance in the Resource Description Framework (RDF) is discussed. Finally, a prototypical implementation of the approach in the Coscine use case is described, and future steps for the approach and prototype are detailed.
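The idea of change indicators can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the paper: the `Snapshot` type, the choice of indicators (size, modification timestamp, SHA-256 digest), the helper names, and the use of the W3C PROV-O property `prov:wasRevisionOf` for the RDF statement. The paper's actual indicators and vocabulary may differ.

```python
import hashlib
from dataclasses import dataclass
from typing import List

# W3C PROV-O property; using it here is an assumption, not the paper's stated choice.
PROV_WAS_REVISION_OF = "http://www.w3.org/ns/prov#wasRevisionOf"


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of a data entity as last observed by the platform."""
    size: int
    modified: float  # modification timestamp reported by the storage provider
    sha256: str


def take_snapshot(content: bytes, modified: float) -> Snapshot:
    """Compute the (assumed) change indicators for one observation."""
    return Snapshot(len(content), modified, hashlib.sha256(content).hexdigest())


def changed_indicators(old: Snapshot, new: Snapshot) -> List[str]:
    """Return the names of the indicators that differ between two observations."""
    names = []
    if old.size != new.size:
        names.append("size")
    if old.modified != new.modified:
        names.append("modified")
    if old.sha256 != new.sha256:
        names.append("sha256")
    return names


def revision_triple(new_uri: str, old_uri: str) -> str:
    """Serialize 'new is a revision of old' as a single N-Triples line."""
    return f"<{new_uri}> <{PROV_WAS_REVISION_OF}> <{old_uri}> ."
```

In this sketch, a platform would store a `Snapshot` each time it sees an entity. On a later visit, a non-empty result from `changed_indicators` means the entity was modified externally in the meantime, so a revision statement can be recorded after the fact with `revision_triple`.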
Paper Citation
in Harvard Style
Heinrichs B. and Politze M. (2021). Asynchronous Data Provenance for Research Data in a Distributed System. In Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-758-509-8, pages 361-367. DOI: 10.5220/0010478003610367
in Bibtex Style
@conference{iceis21,
author={Benedikt Heinrichs and Marius Politze},
title={Asynchronous Data Provenance for Research Data in a Distributed System},
booktitle={Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2021},
pages={361-367},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010478003610367},
isbn={978-989-758-509-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - Asynchronous Data Provenance for Research Data in a Distributed System
SN - 978-989-758-509-8
AU - Heinrichs B.
AU - Politze M.
PY - 2021
SP - 361
EP - 367
DO - 10.5220/0010478003610367