Domain-independent Data-to-Text Generation for Open Data

Andreas Burgdorf, Micaela Barkmann, André Pomp, Tobias Meisen

2022

Abstract

As a result of the efforts of the Open Data movements, the number of Open Data portals and the amount of data published on them are steadily increasing. One aspect that greatly increases the usability of data, yet is often neglected, is the enrichment of data with textual documentation. However, creating descriptions of sufficient quality is time-consuming and thus cost-intensive. One approach to this problem is data-to-text generation, which creates descriptions of raw data. In the past, promising results were achieved on data from Wikipedia. Based on a seq2seq model developed for this purpose, we investigate whether this technique can also be applied in the Open Data domain and which challenges arise in doing so. In three studies, we reproduce the results of a previous work and apply the approach to additional datasets that pose new challenges in terms of data nature and data volume. We conclude that previous methods are not suitable for the Open Data sector without further modification, but the results still exceed our expectations and show the potential of the approach.


Paper Citation


in Harvard Style

Burgdorf A., Barkmann M., Pomp A. and Meisen T. (2022). Domain-independent Data-to-Text Generation for Open Data. In Proceedings of the 11th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-583-8, pages 95-106. DOI: 10.5220/0011272900003269


in Bibtex Style

@conference{data22,
author={Andreas Burgdorf and Micaela Barkmann and André Pomp and Tobias Meisen},
title={Domain-independent Data-to-Text Generation for Open Data},
booktitle={Proceedings of the 11th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2022},
pages={95--106},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011272900003269},
isbn={978-989-758-583-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 11th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Domain-independent Data-to-Text Generation for Open Data
SN - 978-989-758-583-8
AU - Burgdorf A.
AU - Barkmann M.
AU - Pomp A.
AU - Meisen T.
PY - 2022
SP - 95
EP - 106
DO - 10.5220/0011272900003269