Authors:
Rafael Oleques Nunes; Andre Spritzer; Carla Dal Sasso Freitas and Dennis Balreira
Affiliation:
Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
Keyword(s):
In-Context Learning, Large Language Models, Named Entity Recognition, Legal Tech, LLama.
Abstract:
This paper explores the application of the In-Context Learning (ICL) paradigm for Named Entity Recognition (NER) within the Portuguese-language legal domain. Identifying named entities in legal documents is complex due to the intricate nature of legal language and the specificity of legal terms. This task is important for a range of applications, from legal information retrieval to automated summarization and analysis. However, the manual annotation of these entities is costly due to the specialized knowledge required from legal experts and the large volume of documents. Recent advancements in Large Language Models (LLMs) have led to studies exploring the use of ICL to improve the performance of Generative Language Models (GLMs). In this work, we used Sabiá, a Portuguese-language LLM, to extract named entities within the legal domain. Our goal was to evaluate the consistency of these extractions and derive insights from the results. Our methodology involved using a legal-domain NER corpus as input and selecting specific samples for a prompting task. We then instructed the GLM to catalog its own NER corpus, which we compared with the original test examples. Our study examined various aspects, including context examples, selection strategies, heuristic methodologies, post-processing techniques, and quantitative and qualitative analyses across specific domain classes. Our results indicate promising directions for future research and applications in specialized domains.