
APPENDIX
This appendix presents example inputs and model
outputs from the experiments.
A.1 Distillation
The input (Q, C, A):
"""
Instruction: Given the question, the context,
and the expected answer below, provide relevant
quotes from the context that support the answer.
Your answer must be just the quotes, not the
entire context.
Format: ##begin_quote##quote##end_quote##
for each quote.
Do not add anything else other than the quotes.
Your turn:
Question: Unlike Xuzhou, where is Rugao under
the administration of?
Context: Rugao () is a county-level city under
the administration of Nantong, Jiangsu province,
China, located in [about 200 words...] Shanghai.
Answer: Nantong
Quotes:
"""
And the generated quotes (R):
"""
##begin_quote## Rugao () is a county-level
city under the administration of
Nantong ##end_quote##
"""
A.2 Quote Train Sample
The input (Q, C):
"""
Instruction: Given the question and the context,
provide relevant quotes from the context that
support the answer.
Your answer must be just the quotes, not the
entire context.
Format: ##begin_quote##quote##end_quote##
for each quote.
Do not add anything else other than the quotes.
Question: What authority manages the regional
passenger train service that runs through the
same junction as West Amesbury Branch Railroad?
Context: [...]
"""
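For concreteness, a (Q, C) prompt and its gold quotes
can be paired into a supervised fine-tuning record.
The sketch below is ours; the JSONL layout and the
"prompt"/"completion" field names are illustrative
assumptions, not the paper's released format:
"""
import json

def build_train_record(question: str, context: str,
                       quotes: list[str]) -> str:
    # Assemble the (Q, C) prompt from the instruction
    # template shown above.
    prompt = (
        "Instruction: Given the question and the context, "
        "provide relevant quotes from the context that "
        "support the answer.\n"
        f"Question: {question}\n"
        f"Context: {context}\n"
        "Quotes:"
    )
    # The target is the marker-delimited quote list the
    # model should learn to emit.
    completion = "".join(
        f"##begin_quote##{q}##end_quote##" for q in quotes
    )
    # One JSON object per line (JSONL), a common
    # fine-tuning data format.
    return json.dumps({"prompt": prompt, "completion": completion})
"""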