
6 CONCLUSION
In this paper, we explored alternatives for legal text
classification using In-Context Learning (ICL) with Sabiá,
an LLM fine-tuned for Brazilian Portuguese. To this
end, we used two corpora of Brazilian legal
entities, LeNER-Br and UlyssesNER-Br, which encompass
both general entities and entities specific to
the legal domain. Various retrieval strategies were
applied to identify suitable examples for the ICL
context. In addition, we employed post-processing
techniques to eliminate noise and irrelevant labels.
In summary, we found that selecting the top-K most
similar examples is the best heuristic in the general
context, and that filtering techniques play an important
role in obtaining the final results.
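As an illustration of these two steps, the sketch below selects the top-K most similar demonstrations for an ICL prompt and then filters out predictions whose labels fall outside the known tagset. TF-IDF cosine similarity is an assumption standing in for whichever retriever the experiments used, and the helper names (select_top_k, filter_labels) and the exact filtering rule are hypothetical; the tagset shown is LeNER-Br's.

```python
# Minimal sketch of top-K demonstration retrieval and label
# filtering for ICL-based NER. The retriever (TF-IDF) and the
# helper names are illustrative assumptions, not the paper's
# actual implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def select_top_k(query: str, pool: list[str], k: int = 5) -> list[str]:
    """Return the k sentences from the pool most similar to the query."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(pool + [query])
    # Last row is the query; compare it against every pool sentence.
    sims = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    top = sims.argsort()[::-1][:k]
    return [pool[i] for i in top]

# LeNER-Br tagset; the actual filtering criteria may be richer.
VALID_LABELS = {"PESSOA", "ORGANIZACAO", "LOCAL", "TEMPO",
                "LEGISLACAO", "JURISPRUDENCIA"}

def filter_labels(predictions: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop (term, label) predictions whose label is outside the tagset."""
    return [(term, label) for term, label in predictions
            if label in VALID_LABELS]
```

The retrieved sentences would then be formatted as demonstrations in the prompt, and the filter applied to the model's raw output before evaluation.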
We also conducted an in-depth analysis of the model's
predictions. Along with classification errors, we
identified inherent complexities of legal language,
which pose challenges for models trained on general-domain
data. Notably, we observed instances of mis-annotation
that the ICL-enhanced model classified correctly.
Moreover, the model exhibited partial classifications,
assigning the correct category to a term while
overlooking the context in which the term operates,
leading to a classification different from the expected
one. Additionally, our investigation revealed instances
of term ambiguity, in which the model's predictions
diverged from the gold annotations. Despite these
discrepancies, both the predictions and the annotations
were plausible.
For both corpora, we achieved an F1-Score of 51%,
with the best results associated with retrieving
the most similar examples. These experiments
underscore the potential of LLMs for NER.
However, further research is needed in this
area. Our future work will explore additional
heuristics for retrieving relevant documents,
experiment with different prompt templates, and
leverage domain-specific knowledge to enhance
predictive accuracy.
ACKNOWLEDGMENTS
This study was partially funded by the Brazilian
funding agencies Coordenação de Aperfeiçoamento
de Pessoal de Nível Superior (CAPES) - Finance
Code 001 and Conselho Nacional de Desenvolvimento
Científico e Tecnológico (CNPq).