Domain-independent Data-to-Text Generation for Open Data
Andreas Burgdorf, Micaela Barkmann, André Pomp, Tobias Meisen
2022
Abstract
As a result of the efforts of the Open Data movements, the number of Open Data portals and the amount of data published in them are steadily increasing. One aspect that greatly increases the usability of data, but is nevertheless often neglected, is the enrichment of data with textual documentation. However, creating descriptions of sufficient quality is time-consuming and thus cost-intensive. One approach to solving this problem is data-to-text generation, which creates descriptions from raw data. In the past, promising results were achieved on data from Wikipedia. Based on a seq2seq model developed for such purposes, we investigate whether this technique can also be applied in the Open Data domain and which challenges are associated with doing so. In three studies, we reproduce the results of a previous work and apply the approach to additional datasets that pose new challenges in terms of data nature and data volume. We conclude that previous methods are not suitable for application in the Open Data sector without further modification, but the results still exceed our expectations and show the potential of the approach.
Paper Citation
in Harvard Style
Burgdorf A., Barkmann M., Pomp A. and Meisen T. (2022). Domain-independent Data-to-Text Generation for Open Data. In Proceedings of the 11th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-583-8, pages 95-106. DOI: 10.5220/0011272900003269
in Bibtex Style
@conference{data22,
author={Andreas Burgdorf and Micaela Barkmann and André Pomp and Tobias Meisen},
title={Domain-independent Data-to-Text Generation for Open Data},
booktitle={Proceedings of the 11th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2022},
pages={95-106},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011272900003269},
isbn={978-989-758-583-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 11th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Domain-independent Data-to-Text Generation for Open Data
SN - 978-989-758-583-8
AU - Burgdorf A.
AU - Barkmann M.
AU - Pomp A.
AU - Meisen T.
PY - 2022
SP - 95
EP - 106
DO - 10.5220/0011272900003269