
Authors: Desiree Heim 1,2; Christian Jilek 1; Adrian Ulges 3 and Andreas Dengel 1,2

Affiliations: 1 Smart Data and Knowledge Services Department, German Research Center for Artificial Intelligence (DFKI), Germany; 2 Department of Computer Science, University of Kaiserslautern-Landau (RPTU), Germany; 3 Department DCSM, RheinMain University of Applied Sciences, Germany

Keyword(s): Knowledge Work Dataset Generator, Large Language Model, Configurability.

Abstract: The evaluation of support tools designed for knowledge workers is challenging due to the lack of publicly available, extensive, and complete data collections. Existing data collections have inherent problems such as incompleteness due to privacy-preserving methods and a lack of contextual information. Hence, generating datasets can be a good alternative; in particular, Large Language Models (LLMs) offer a simple way to generate textual artifacts. We therefore recently proposed a knowledge work dataset generator called KnoWoGen. So far, the adherence of generated knowledge work documents to parameters such as document type, involved persons, or topics has not been examined. However, this aspect is crucial since generated documents should properly reflect the given parameters, as these could serve as highly relevant ground-truth information for training or evaluation purposes. In this paper, we address this missing evaluation aspect by conducting corresponding user studies. These studies assess the documents’ adherence to multiple parameters and specifically to a given domain parameter as an important, representative one. We base our experiments on documents generated with KnoWoGen and use the Mistral-7B-Instruct model as the LLM. We observe that in the given setting, the generated documents showed high quality regarding adherence to parameters in general and specifically to the parameter specifying the document’s domain. In the parameter-related experiments, 75% of the given ratings received the highest or second-highest quality score, which is a promising outcome for the feasibility of generating high-quality knowledge work documents based on given configurations.

CC BY-NC-ND 4.0


Paper citation in several formats:
Heim, D., Jilek, C., Ulges, A. and Dengel, A. (2025). Investigating the Configurability of LLMs for the Generation of Knowledge Work Datasets. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-737-5; ISSN 2184-433X, SciTePress, pages 821-828. DOI: 10.5220/0013184200003890

@conference{icaart25,
author={Desiree Heim and Christian Jilek and Adrian Ulges and Andreas Dengel},
title={Investigating the Configurability of LLMs for the Generation of Knowledge Work Datasets},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2025},
pages={821-828},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013184200003890},
isbn={978-989-758-737-5},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Investigating the Configurability of LLMs for the Generation of Knowledge Work Datasets
SN - 978-989-758-737-5
IS - 2184-433X
AU - Heim, D.
AU - Jilek, C.
AU - Ulges, A.
AU - Dengel, A.
PY - 2025
SP - 821
EP - 828
DO - 10.5220/0013184200003890
PB - SciTePress