Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., and Zettlemoyer, L. (2020). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proc. of the Annual Meeting of the Assoc. for Computational Linguistics, pages 7871–7880. ACL.
Li, Y., Su, H., Shen, X., Li, W., Cao, Z., and Niu, S. (2017). DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. Computing Research Repository (CoRR).
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. Computing Research Repository (CoRR), abs/1907.11692.
Liu, Z., Zhou, Y., Zhu, Y., Lian, J., Li, C., Dou, Z., Lian, D., and Nie, J.-Y. (2024). Information Retrieval Meets Large Language Models. In Proc. of the ACM Web Conf. (WWW Companion), pages 1586–1589.
Mahadevkar, S. V., Patil, S., Kotecha, K., Soong, L. W., and Choudhury, T. (2024). Exploring AI-driven approaches for unstructured document analysis and future horizons. Journal of Big Data, 11(1).
Mastropaolo, A., Scalabrino, S., Cooper, N., Nader Palacio, D., Poshyvanyk, D., Oliveto, R., and Bavota, G. (2021). Studying the usage of text-to-text transfer transformer to support code-related tasks. In Proc. of ICSE Conf., pages 336–347.
Nasution, A. H. and Onan, A. (2024). ChatGPT Label: Comparing the Quality of Human-Generated and LLM-Generated Annotations in Low-Resource Language NLP Tasks. IEEE Access, 12:71876–71900.
Peña, A., Morales, A., Fierrez, J., Serna, I., Ortega-Garcia, J., Puente, I., Córdova, J., and Córdova, G. (2023). Leveraging Large Language Models for Topic Classification in the Domain of Public Affairs. Lecture Notes in Computer Science, 14193:20–33.
Radford, A., Narasimhan, K., Salimans, T., Sutskever, I., et al. (2018). Improving language understanding by generative pre-training.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.
Reimers, N. and Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using siamese BERT-networks. Computing Research Repository (CoRR).
Rodríguez-Cantelar, M., Estecha-Garitagoitia, M., D’Haro, L. F., Matía, F., and Córdoba, R. (2023). Automatic Detection of Inconsistencies and Hierarchical Topic Classification for Open-Domain Chatbots. Applied Sciences (Switzerland), 13(16).
Russo, G., Stoehr, N., and Ribeiro, M. H. (2023). ACTI at EVALITA 2023: Automatic Conspiracy Theory Identification Task Overview. In CEUR Workshop Proc., volume 3473. CEUR-WS.
Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. Computing Research Repository (CoRR).
Schabus, D., Skowron, M., and Trapp, M. (2017). One Million Posts: A Data Set of German Online Discussions. In Proc. of the 40th SIGIR Conf., pages 1241–1244. ACM.
Socher, R., Perelygin, A., Wu, J. Y., Chuang, J., Manning, C. D., Ng, A. Y., and Potts, C. (2013). Recursive deep models for semantic compositionality over a sentiment treebank. In Proc. of EMNLP, pages 1631–1642.
Stahlschmidt, S. and Stephen, D. (2020). Comparison of Web of Science, Scopus and Dimensions databases. Technical report, KB Forschungspoolprojekt, DZHW, Hannover, Germany.
Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., and Lample, G. (2023). LLaMA: Open and Efficient Foundation Language Models. Computing Research Repository (CoRR).
Trust, P. and Minghim, R. (2023). Query-Focused Submodular Demonstration Selection for In-Context Learning in Large Language Models. In Proc. of the 31st Irish AICS Conf.
Van Nooten, J., Kosar, A., De Pauw, G., and Daelemans, W. (2024). Advancing CSR Theme and Topic Classification: LLMs and Training Enhancement Insights. In Proc. of FinNLP-KDF-ECONLP@LREC-COLING, pages 292–305.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, volume 30, pages 5999–6009.
Williams, A., Nangia, N., and Bowman, S. (2018). A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. In Walker, M., Ji, H., and Stent, A., editors, Proc. of the Conf. of the North American Chapter of the Assoc. for Computational Linguistics: Human Language Technologies, volume 1, pages 1112–1122. ACL.
Wohlin, C. (2014). Guidelines for snowballing in systematic literature studies and a replication in software engineering. In Proc. of the 18th EASE Conf. ACM.
Yu, P., Xu, H., Hu, X., and Deng, C. (2023). Leveraging Generative AI and Large Language Models: A Comprehensive Roadmap for Healthcare Integration. Healthcare (Switzerland), 11(20).
Zhang, S., Roller, S., Goyal, N., Artetxe, M., Chen, M., Chen, S., Dewan, C., Diab, M., Li, X., Lin, X. V., et al. (2022). OPT: Open Pre-trained Transformer Language Models. Computing Research Repository (CoRR).
Zhang, X., Zhao, J., and LeCun, Y. (2015). Character-level convolutional networks for text classification. Computing Research Repository (CoRR).