Authors:
Vinícius Di Oliveira 1,2; Yuri Bezerra 2; Li Weigang 2; Pedro Brom 2,3 and Victor Celestino 4
Affiliations:
1 Secretary of Economy, Brasilia, Federal District, Brazil
2 TransLab, University of Brasilia, Brasilia, Federal District, Brazil
3 Federal Institute of Brasilia, Brasilia, Federal District, Brazil
4 LAMFO, Department of Administration, University of Brasilia, Brasilia, Federal District, Brazil
Keyword(s):
Fine-Tuning, HS, Large Language Model, NCM, Portuguese Language, Retrieval Augmented Generation.
Abstract:
Natural language processing (NLP) has seen significant advances with the advent of large language models (LLMs). However, substantial improvements are still needed for languages other than English, especially in specific domains such as applications of the Mercosur Common Nomenclature (NCM), the Brazilian adaptation of the Harmonized System (HS). To address this gap, this study uses TeenyTineLLaMA, a foundational Portuguese LLM, as the base model for NCM application processing. Additionally, a simplified Retrieval-Augmented Fine-Tuning (RAFT) technique, termed SLIM-RAFT, is proposed for task-specific fine-tuning of LLMs. This approach retains the chain-of-thought (CoT) methodology for prompt development in a more concise and streamlined manner, using brief, focused documents for training. The proposed model offers an efficient and cost-effective alternative for fine-tuning smaller LLMs, significantly outperforming TeenyTineLLaMA and ChatGPT-4 on the same task. Although the research focuses on NCM applications, the methodology can be easily adapted for HS applications worldwide.
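As a rough illustration of the kind of training record such an approach implies, the sketch below assembles one fine-tuning example that pairs a brief retrieved document with a concise chain-of-thought completion. The function name, record fields, and the NCM snippet used here are illustrative assumptions, not the paper's actual data format or pipeline.

```python
import json

def build_raft_sample(question: str, retrieved_doc: str, cot_answer: str) -> dict:
    """Assemble one SLIM-RAFT-style training record (hypothetical format):
    the prompt embeds a short retrieved document plus the question, and the
    completion carries a brief chain-of-thought ending in the NCM code."""
    prompt = (
        "Document: " + retrieved_doc + "\n"
        "Question: " + question + "\n"
        "Reasoning and answer:"
    )
    return {"prompt": prompt, "completion": cot_answer}

# Illustrative example: the NCM entry and question are made up for the sketch.
sample = build_raft_sample(
    question="Which NCM code applies to roasted coffee beans?",
    retrieved_doc="0901.21 - Coffee, roasted, not decaffeinated.",
    cot_answer=(
        "The product is roasted, non-decaffeinated coffee; chapter 09 covers "
        "coffee, and subheading 0901.21 matches exactly. NCM: 0901.21.00."
    ),
)

# One JSONL line of this form per training example.
print(json.dumps(sample, ensure_ascii=False))
```

In a RAFT-style setup, each record is emitted as one JSONL line; keeping the document and reasoning short is what the "SLIM" simplification amounts to here.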