
Liu, Y., He, H., Han, T., Zhang, X., Liu, M., Tian, J., Zhang, Y., Wang, J., Gao, X., Zhong, T., et al. (2024). Understanding LLMs: A comprehensive overview from training to inference. arXiv preprint arXiv:2401.02038.
Minaee, S., Mikolov, T., Nikzad, N., Chenaghlu, M., Socher, R., Amatriain, X., and Gao, J. (2024). Large language models: A survey. arXiv preprint arXiv:2402.06196.
Popović, M. (2015). chrF: Character n-gram F-score for automatic MT evaluation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 392–395.
Rei, R., De Souza, J. G., Alves, D., Zerva, C., Farinha, A. C., Glushkova, T., Lavie, A., Coheur, L., and Martins, A. F. (2022). COMET-22: Unbabel-IST 2022 submission for the metrics shared task. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 578–585.
Singh, H., Gupta, N., Bharadwaj, S., Tewari, D., and Talukdar, P. (2024). IndicGenBench: A multilingual benchmark to evaluate generation capabilities of LLMs on Indic languages. In Ku, L.-W., Martins, A., and Srikumar, V., editors, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11047–11073, Bangkok, Thailand. Association for Computational Linguistics.
Gemini Team, Anil, R., Borgeaud, S., Wu, Y., Alayrac, J.-B., Yu, J., Soricut, R., Schalkwyk, J., Dai, A. M., Hauth, A., et al. (2023). Gemini: A family of highly capable multimodal models. arXiv preprint arXiv:2312.11805.
Gemma Team, Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju, S., Pathak, S., Sifre, L., Rivière, M., Kale, M. S., Love, J., Tafti, P., Hussenot, L., Sessa, P. G., Chowdhery, A., Roberts, A., Barua, A., Botev, A., Castro-Ros, A., Slone, A., Héliou, A., Tacchetti, A., Bulanova, A., Paterson, A., Tsai, B., Shahriari, B., Lan, C. L., Choquette-Choo, C. A., Crepy, C., Cer, D., Ippolito, D., Reid, D., Buchatskaya, E., Ni, E., Noland, E., Yan, G., Tucker, G., Muraru, G.-C., Rozhdestvenskiy, G., Michalewski, H., Tenney, I., Grishchenko, I., Austin, J., Keeling, J., Labanowski, J., Lespiau, J.-B., Stanway, J., Brennan, J., Chen, J., Ferret, J., Chiu, J., Mao-Jones, J., Lee, K., Yu, K., Millican, K., Sjoesund, L. L., Lee, L., Dixon, L., Reid, M., Mikuła, M., Wirth, M., Sharman, M., Chinaev, N., Thain, N., Bachem, O., Chang, O., Wahltinez, O., Bailey, P., Michel, P., Yotov, P., Chaabouni, R., Comanescu, R., Jana, R., Anil, R., McIlroy, R., Liu, R., Mullins, R., Smith, S. L., Borgeaud, S., Girgin, S., Douglas, S., Pandya, S., Shakeri, S., De, S., Klimenko, T., Hennigan, T., Feinberg, V., Stokowiec, W., Chen, Y.-h., Ahmed, Z., Gong, Z., Warkentin, T., Peran, L., Giang, M., Farabet, C., Vinyals, O., Dean, J., Kavukcuoglu, K., Hassabis, D., Ghahramani, Z., Eck, D., Barral, J., Pereira, F., Collins, E., Joulin, A., Fiedel, N., Senter, E., Andreev, A., and Kenealy, K. (2024). Gemma: Open models based on Gemini research and technology.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al. (2023). Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Vilar, D., Freitag, M., Cherry, C., Luo, J., Ratnakar, V., and Foster, G. (2023). Prompting PaLM for translation: Assessing strategies and performance. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15406–15427.
Xu, H., Kim, Y. J., Sharaf, A., and Awadalla, H. H. (2023). A paradigm shift in machine translation: Boosting translation performance of large language models. arXiv preprint arXiv:2309.11674.