Towards Improving Translation Ability of Large Language Models on Low Resource Languages

Amulya Ratna Dash, Yashvardhan Sharma

2025

Abstract

With advancements in Natural Language Processing (NLP) and Large Language Models (LLMs), there is a growing need to understand their capabilities on low-resource languages. This study focuses on benchmarking and improving the machine translation ability of LLMs for low-resource Indic languages. We analyze how training dataset size and overfitting (induced by training for additional epochs) affect translation quality. Using LLaMA-3 as the base model, we propose a simple, resource-efficient finetuning approach that consistently improves zero-shot translation performance across eight translation directions.
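
This page does not reproduce the paper's code, so the snippet below is an illustrative sketch only: it shows one common resource-efficient finetuning setup, LoRA adapters via the HuggingFace PEFT library applied to a LLaMA-3 checkpoint. The checkpoint name, LoRA hyperparameters, and prompt template are assumptions made for illustration, not details taken from the paper.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed checkpoint; the abstract only states "LLaMA-3" as the base model.
BASE_MODEL = "meta-llama/Meta-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory
    device_map="auto",
)

# LoRA trains small low-rank adapter matrices while the base weights stay
# frozen; the hyperparameters below are illustrative guesses, not the paper's.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the (small) trainable fraction

def format_example(src_lang: str, tgt_lang: str, src: str, tgt: str) -> str:
    """Hypothetical prompt template for one parallel sentence pair."""
    return (
        f"Translate the following sentence from {src_lang} to {tgt_lang}.\n"
        f"{src_lang}: {src}\n"
        f"{tgt_lang}: {tgt}{tokenizer.eos_token}"
    )

Because the base weights stay frozen and only the adapter matrices are updated, setups in this family train a small fraction of the parameters, which is what makes them resource-efficient.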

Paper Citation


in Harvard Style

Dash A. R. and Sharma Y. (2025). Towards Improving Translation Ability of Large Language Models on Low Resource Languages. In Proceedings of the 14th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-758-730-6, SciTePress, pages 801-807. DOI: 10.5220/0013319000003905


in Bibtex Style

@conference{icpram25,
author={Amulya Ratna Dash and Yashvardhan Sharma},
title={Towards Improving Translation Ability of Large Language Models on Low Resource Languages},
booktitle={Proceedings of the 14th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2025},
pages={801--807},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013319000003905},
isbn={978-989-758-730-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Towards Improving Translation Ability of Large Language Models on Low Resource Languages
SN - 978-989-758-730-6
AU - Dash A. R.
AU - Sharma Y.
PY - 2025
SP - 801
EP - 807
DO - 10.5220/0013319000003905
PB - SciTePress
ER -