{ ?lastWill rdf:type onto:CollectImportantDocuments .
?lastWill onto:SearchProcedure ?searchLastWill .
FILTER (?searchLastWill = onto:lastWill) }
This query was then wrapped in the appropriate
formal structure and executed against the ontology,
returning the results associated with the relevant
subgraphs of the ontology.
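The wrapping step can be sketched as follows: the predicted WHERE pattern is completed with PREFIX declarations and a SELECT clause before execution. This is a minimal illustration only; the prefix IRIs, the `onto` namespace IRI, and the helper name are assumptions, not the code used in our system.

```python
# Sketch: turning a predicted WHERE pattern into a complete SELECT query.
# The namespace IRIs below are placeholders (assumptions for the example).

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "onto": "http://example.org/ontology#",  # placeholder namespace IRI
}

def build_select_query(pattern, variables=("?lastWill",)):
    """Prepend PREFIX declarations and a SELECT clause to a WHERE pattern."""
    prefix_block = "\n".join(
        f"PREFIX {p}: <{iri}>" for p, iri in PREFIXES.items()
    )
    return f"{prefix_block}\nSELECT {' '.join(variables)} WHERE {{\n{pattern}\n}}"

# The graph pattern produced by the model for the last-will question:
pattern = (
    "?lastWill rdf:type onto:CollectImportantDocuments .\n"
    "?lastWill onto:SearchProcedure ?searchLastWill .\n"
    "FILTER (?searchLastWill = onto:lastWill)"
)
query = build_select_query(pattern)
```

The resulting string can then be submitted to any SPARQL engine (for example, an RDF store holding the ontology) to retrieve the matching subgraphs.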
4 DISCUSSION AND PERSPECTIVES
With existing models, it is very challenging to
accurately predict a specific class of need, especially
when the dataset contains several hundred classes of
need. Our approach, however, yields better results for
user-need classification, making a significant
contribution to current research. We also
extended our experiments by deploying a chatbot to
demonstrate how the tested approaches are used
in real time.
In the context of our study, the comparative
analysis of the performance of the two considered
models, namely T5 and GPT-2, has highlighted
significant differences. The results reveal that the T5
model stands out for its remarkable robustness,
displaying an impressive score of 97.22%. This
outstanding performance remains consistent across
various evaluated parameters, validated across an
exhaustive range of training tests. In contrast, the
GPT-2 model presents less conclusive results, with an
overall score of 84%, while showing comparatively
lower relative stability. Notably, achieving this level
of performance with GPT-2 requires a substantial
time investment, as up to 16 training epochs are
needed to stabilize its performance, compared to the
8 epochs required by T5 to achieve optimal stability.
These empirical observations attest to the superior
ability of the T5 model to effectively handle our
classification task, by adapting appropriately to the
inherent structure of our dataset.
Our experiments were conducted on a relatively
small corpus, and we are aware of the advantages a
richer dataset can provide. To further enrich this data,
we are considering integrating large language models
(LLMs) such as GPT-4. Integrating such models should
not only enhance the quality of predictions but also
generate relevant, fluent responses by leveraging
more diverse data during training. Enriching our
dataset and adopting more advanced models will thus
help us better address the rapidly growing needs of
users and strengthen the robustness of our
classification system.
5 CONCLUSIONS
In the scope of our study, we investigated the use of a
transformer encoder-decoder model to translate
natural language sentences into formal SPARQL
queries capable of interacting with an ontology. This
approach aims to reduce the complexity of the
SPARQL language, often used to query RDF
ontologies, thereby making these interactions more
accessible to users unfamiliar with its syntax.
To accomplish this, we tested two transformer
models, T5 and GPT-2, on a dataset composed of
pairs of questions and corresponding SPARQL
queries, specifically created for this task. The results
obtained demonstrate that the T5 model offers
superior performance, with a score of 97.22%, and
increased stability across various evaluated
parameters, requiring only 8 epochs to achieve this
stability. In comparison, the GPT-2 model showed
lower stability, with a score of 84%, and required up
to 16 epochs to achieve comparable results. These
findings underscore the robustness and efficiency of
the T5 model for generating SPARQL queries from
natural language questions, indicating its ability to
facilitate interaction with ontologies for non-expert
users.
Optimization of Methods for Querying Formal Ontologies in Natural Language Using a Neural Network