significantly enhance performance, while unbalanced training yields superior results to balanced training, underscoring the importance of preserving the natural class distribution in training datasets. This research contributes to the understanding of automated classification of domain entities and has implications for developing more effective ontology-based classification systems. In future work, we aim to explore other advanced language models, such as Llama and Mixtral, in the classification pipeline.
ACKNOWLEDGMENTS
Research supported by the Coordination for the Improvement of Higher Education Personnel (CAPES), code 0001, the Brazilian National Council for Scientific and Technological Development (CNPq), and Petrobras.
REFERENCES
Arp, R., Smith, B., and Spear, A. D. (2015). Building ontologies with Basic Formal Ontology. MIT Press.
Buttigieg, P. L., Morrison, N., Smith, B., Mungall, C. J., Lewis, S. E., and Consortium, E. (2013). The environment ontology: contextualising biological and biomedical entities. Journal of Biomedical Semantics, 4:1–9.
Degtyarenko, K., De Matos, P., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcántara, R., Darsow, M., Guedj, M., and Ashburner, M. (2007). ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Research, 36(suppl 1):D344–D350.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Francis, W. N. (1965). A standard corpus of edited present-day American English. College English, 26(4):267–273.
Gangemi, A., Navigli, R., and Velardi, P. (2003). The OntoWordNet project: Extension and axiomatization of conceptual relations in WordNet. In Meersman, R., Tari, Z., and Schmidt, D. C., editors, On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE, pages 820–838, Berlin, Heidelberg. Springer Berlin Heidelberg.
Jackson, R., Matentzoglu, N., Overton, J. A., Vita, R., Balhoff, J. P., Buttigieg, P. L., Carbon, S., Courtot, M., Diehl, A. D., Dooley, D. M., et al. (2021). OBO Foundry in 2021: operationalizing open data principles to evaluate ontologies. Database, 2021.
Jaiswal, P., Avraham, S., Ilic, K., Kellogg, E. A., McCouch, S., Pujar, A., Reiser, L., Rhee, S. Y., Sachs, M. M., Schaeffer, M., et al. (2005). Plant Ontology (PO): a controlled vocabulary of plant structures and growth stages. Comparative and Functional Genomics, 6(7-8):388–397.
Jiang, A. Q., Sablayrolles, A., Roux, A., Mensch, A., Savary, B., Bamford, C., Chaplot, D. S., Casas, D. d. l., Hanna, E. B., Bressand, F., et al. (2024). Mixtral of experts. arXiv preprint arXiv:2401.04088.
Jullien, M., Valentino, M., and Freitas, A. (2022). Do transformers encode a foundational ontology? Probing abstract classes in natural language. arXiv preprint arXiv:2201.10262.
Lopes, A., Carbonera, J., Schmidt, D., and Abel, M. (2022). Predicting the top-level ontological concepts of domain entities using word embeddings, informal definitions, and deep learning. Expert Systems with Applications, 203:117291.
Lopes, A., Carbonera, J., Schmidt, D., Garcia, L., Rodrigues, F., and Abel, M. (2023). Using terms and informal definitions to classify domain entities into top-level ontology concepts: An approach based on language models. Knowledge-Based Systems, 265:110385.
Sahoo, P., Singh, A. K., Saha, S., Jain, V., Mondal, S., and Chadha, A. (2024). A systematic survey of prompt engineering in large language models: Techniques and applications. arXiv preprint arXiv:2402.07927.
Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L. J., Eilbeck, K., Ireland, A., Mungall, C. J., et al. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25(11):1251–1255.
Studer, R., Benjamins, V. R., and Fensel, D. (1998). Knowledge engineering: Principles and methods. Data & Knowledge Engineering, 25(1-2):161–197.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., Bikel, D., Blecher, L., Ferrer, C. C., Chen, M., Cucurull, G., Esiobu, D., Fernandes, J., Fu, J., Fu, W., Fuller, B., Gao, C., Goswami, V., Goyal, N., Hartshorn, A., Hosseini, S., Hou, R., Inan, H., Kardas, M., Kerkez, V., Khabsa, M., Kloumann, I., Korenev, A., Koura, P. S., Lachaux, M.-A., Lavril, T., Lee, J., Liskovich, D., Lu, Y., Mao, Y., Martinet, X., Mihaylov, T., Mishra, P., Molybog, I., Nie, Y., Poulton, A., Reizenstein, J., Rungta, R., Saladi, K., Schelten, A., Silva, R., Smith, E. M., Subramanian, R., Tan, X. E., Tang, B., Taylor, R., Williams, A., Kuan, J. X., Xu, P., Yan, Z., Zarov, I., Zhang, Y., Fan, A., Kambadur, M., Narang, S., Rodriguez, A., Stojnic, R., Edunov, S., and Scialom, T. (2023). Llama 2: Open foundation and fine-tuned chat models.
ICEIS 2024 - 26th International Conference on Enterprise Information Systems