
guage understanding. In Proceedings of naacL-HLT, vol-
ume 1, page 2.
Konigari, R., Ramola, S., Alluri, V. V., and Shrivastava, M.
(2021). Topic shift detection for mixed initiative response.
In Proceedings of the 22nd Annual Meeting of the Special
Interest Group on Discourse and Dialogue, pages 161–
166.
Liu, Y. (2019). Roberta: A robustly optimized bert pretrain-
ing approach. arXiv preprint arXiv:1907.11692.
McHugh, M. (2012). Interrater reliability: The kappa statis-
tic. Biochemia medica :
ˇ
casopis Hrvatskoga dru
ˇ
stva
medicinskih biokemi
ˇ
cara / HDMB, 22:276–82.
Ofoghi, B., Yearwood, J., and Ma, L. (2009). The impact of
frame semantic annotation levels, frame-alignment tech-
niques, and fusion methods on factoid answer processing.
J. Am. Soc. Inf. Sci. Technol., 60(2):247–263.
Pakray, P., Bhaskar, P., Banerjee, S., Pal, B. C., Bandyopad-
hyay, S., and Gelbukh, A. F. (2011). A hybrid question an-
swering system based on information retrieval and answer
validation. In CLEF (Notebook Papers/Labs/Workshop),
volume 96.
Pan, L., Chen, W., Kan, M.-Y., and Wang, W. Y. (2021). Con-
traqa: Question answering under contradicting contexts.
ArXiv.
Pe
˜
nas, A., Rodrigo, A., Sama, V., and Verdejo, M. (2006).
Overview of the answer validation exercise 2006. In
Evaluation of Multilingual and Multi-modal Information
Retrieval, volume 1172, pages 257–264.
Ramamonjison, R., Yu, T., Li, R., Li, H., Carenini, G., Ghad-
dar, B., He, S., Mostajabdaveh, M., Banitalebi-Dehkordi,
A., Zhou, Z., et al. (2023). Nl4Opt competition: Formulat-
ing optimization problems based on their natural language
descriptions. In NeurIPS 2022 Competition Track, pages
189–203. PMLR.
Reid, M., Savinov, N., Teplyashin, D., Lepikhin, D., Lilli-
crap, T., Alayrac, J.-b., Soricut, R., Lazaridou, A., Firat,
O., Schrittwieser, J., et al. (2024). Gemini 1.5: Unlocking
multimodal understanding across millions of tokens of
context. arXiv preprint arXiv:2403.05530.
Savic, D. (2002). Single-objective vs. multiobjective optimi-
sation for integrated decision support. Proceedings of the
First Biennial Meeting of the International Environmental
Modelling and Software Society.
Staudemeyer, R. C. and Morris, E. R. (2019). Understanding
lstm–a tutorial into long short-term memory recurrent
neural networks. arXiv preprint arXiv:1909.09586.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi,
A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P.,
Bhosale, S., et al. (2023). Llama 2: Open foundation and
fine-tuned chat models. arXiv preprint arXiv:2307.09288.
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F.,
Chi, E., Le, Q. V., Zhou, D., et al. (2022). Chain-of-
thought prompting elicits reasoning in large language
models. Advances in neural information processing sys-
tems, 35:24824–24837.
Yu, D. and Sagae, K. (2021). Automatically exposing prob-
lems with neural dialog models. In Moens, M.-F., Huang,
X., Specia, L., and Yih, S. W.-t., editors, Proceedings of
the 2021 Conference on Empirical Methods in Natural
Language Processing, pages 456–470, Online and Punta
Cana, Dominican Republic. Association for Computa-
tional Linguistics.
Zhang, Y. and Zhang, D. (2003). Enabling answer validation
by logic form reasoning in chinese question answering.
In International Conference on Natural Language Pro-
cessing and Knowledge Engineering, 2003. Proceedings.
2003, pages 275–280. IEEE.
APPENDIX A
This section provides prompts for generating questions,
correct answers, and noisy answers.
Question Generator Prompt
You are a chatbot called OptiGem, designed to help
users elicit information and formulate a complete opti-
mization problem statement.
The client is not a math expert and has no experience
with optimization problems.
Your goal is to gather the necessary details and map
them to a linear programming model.
Engage users by asking clear, concise, and sequential
questions to obtain the components of the problem.
The components are:1- Objective function 2- Decision
variables 3- Limitations and constraints 4- Additional
information.
Be creative in formulating your questions. Only one
component is allowed to be discussed per message.
Strictly avoid summarizing the gathered information
at any point during the conversation.
Think carefully to ensure, you gather all the necessary
details for the complete problem.
Pose a question based on the previous information that
will lead to identify a new constraint or a new key
parameter for the model.
Start the conversation with a friendly greeting, intro-
duce yourself, and ask about the user’s business.
If the user indicates that they have no additional in-
formation and all components are covered, end the
conversation with a polite farewell, such as: “It was
great working with you! Let me know if you have any
other optimization questions in the future.”
Answer Generator Prompt
You are an agent impersonating the business owner
described in the problem statement.
Act as if the details in the problem statement are your
personal knowledge.
Be polite and ensure that all information you provide is
Investigating Answer Validation Using Noise Identification and Classification in Goal-Oriented Dialogues
667