sis as inputs for the process of Ontology Knowledge
Discovery, showing how we query the ontology to find
the concepts and properties represented in ArchOnto
that are related to the terms found at the morphological
and syntactic analysis level. The representation of
different CIDOC-CRM concepts such as persons, events,
times, locations and documents is illustrated,
highlighting the migration process of a baptism
document.
Future work includes evaluating the quality of the
ArchOnto population. Traditionally, this evaluation
relies on precision and recall measures, as in
Information Retrieval. However, for ontology population,
a binary yes/no classification does not take into
account cases where the information is partially
captured, so methods like the Balanced Distance Metric
(Maynard et al., 2008) are more adequate to our
evaluation task. The EPISA project has the human
resources that will enable us to undertake this task.
Extending the prototype to other document types will
bring new challenges that still lack good solutions.
For instance, when a new person shares some properties
with a known person, should we consider them the same
person? When a person's parents and grandparents have
the same names as those of a person known in ArchOnto,
are the two siblings? Should that relation be considered
an extension of ArchOnto, since it does not exist in
CIDOC-CRM?
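The difference between binary and partial-credit scoring mentioned for the evaluation can be illustrated with a minimal sketch. It is not the Balanced Distance Metric itself and not part of the prototype: the class hierarchy `PARENT`, the toy `gold` and `predicted` assignments, and the path-overlap credit function are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    precision: float
    recall: float


# Hypothetical toy class hierarchy (child -> parent).
# These are NOT the real CIDOC-CRM or ArchOnto classes.
PARENT = {
    "Baptism": "Event",
    "Birth": "Event",
    "Event": "Entity",
    "Person": "Entity",
}


def ancestors(label):
    """Return the label together with all of its ancestors in PARENT."""
    chain = {label}
    while label in PARENT:
        label = PARENT[label]
        chain.add(label)
    return chain


def strict_credit(gold_label, predicted_label):
    """Binary yes/no scoring: full credit only for an exact match."""
    return 1.0 if gold_label == predicted_label else 0.0


def partial_credit(gold_label, predicted_label):
    """Assumed partial scoring: overlap of the two ancestor paths (Jaccard)."""
    gold_path = ancestors(gold_label)
    predicted_path = ancestors(predicted_label)
    return len(gold_path & predicted_path) / len(gold_path | predicted_path)


def evaluate(gold, predicted, credit):
    """Precision and recall where each prediction earns credit in [0, 1]."""
    earned = sum(
        credit(gold[key], label)
        for key, label in predicted.items()
        if key in gold
    )
    precision = earned / len(predicted) if predicted else 0.0
    recall = earned / len(gold) if gold else 0.0
    return Score(precision, recall)


# Toy data: instance identifier -> assigned class.
gold = {"d1": "Baptism", "d2": "Person", "d3": "Birth"}
predicted = {"d1": "Baptism", "d2": "Person", "d3": "Event"}

strict_score = evaluate(gold, predicted, strict_credit)
partial_score = evaluate(gold, predicted, partial_credit)
print(strict_score)
print(partial_score)
```

In this toy case the instance `d3` is classified as the more general `Event` instead of `Birth`. Strict scoring counts it as a plain error (precision and recall of 2/3), while the path-overlap credit rewards the near miss (precision and recall of 8/9).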
ACKNOWLEDGEMENTS
This work is financed by National Funds through
the Portuguese funding agency, FCT (Fundação
para a Ciência e a Tecnologia) within project
DSAIPA/DS/0023/2018.
REFERENCES
Bick, E. (2014). Palavras, a constraint grammar-based
parsing system for Portuguese. Working with Portuguese
Corpora, pages 279–302.
Bountouri, L. and Gergatsoulis, M. (2011). The semantic
mapping of archival metadata to the CIDOC CRM ontology.
Journal of Archival Organization, 9(3-4):174–207.
Cavalcanti, M. C., Pereira, F. D., Fusco, E., and Mucheroni,
M. L. (2017). Model of data extraction in the innovation
environments of the state of São Paulo based on
semantic technologies. In International Conference
on Information Systems & Technology Management.
de Almeida, M. J. and Runa, L. (2018). ICON Project: Con-
tent integration in Portuguese National Archives using
CIDOC-CRM. In 2018 CIDOC Annual Conference.
di Buono, M. P., Monteleone, M., and Elia, A. (2014).
How to populate ontologies. In Métais, E., Roche,
M., and Teisseire, M., editors, Natural Language
Processing and Information Systems, pages 55–58, Cham.
Springer International Publishing.
Fragkou, P., Kritikos, N., and Galiotou, E. (2016). Querying
Greek governmental site using SPARQL. In Proceedings
of the 20th Pan-Hellenic Conference on Informatics,
PCI ’16, New York, NY, USA. Association for
Computing Machinery.
ICOM/CIDOC-CRM Special Interest Group (2019). Def-
inition of the CIDOC Conceptual Reference Model.
ICOM, 6.2.7 edition.
International Council on Archives (2000). ISAD(G) Second
Edition. International Council on Archives.
Koch, I., Freitas, N., Ribeiro, C., Lopes, C. T., and da Silva,
J. R. (2019). Knowledge graph implementation of
archival descriptions through CIDOC-CRM. In Doucet,
A., Isaac, A., Golub, K., Aalberg, T., and Jatowt, A.,
editors, Digital Libraries for Open Knowledge, pages
99–106, Cham. Springer International Publishing.
Koch, I., Ribeiro, C., and Lopes, C. T. (2020). ArchOnto, a
CIDOC-CRM-based linked data model for the Portuguese
archives. In 24th International Conference on Theory
and Practice of Digital Libraries, TPDL.
Makki, J. (2017). Ontoprima: A prototype for automat-
ing ontology population. International Journal of
Web/Semantic Technology (IJWesT), 8.
Makki, J., Alquier, A.-M., and Prince, V. (2008). An
NLP-based ontology population for a risk management
generic structure. In Proceedings of the 5th
International Conference on Soft Computing as
Transdisciplinary Science and Technology, CSTST ’08, pages
350–355, New York, NY, USA. Association for
Computing Machinery.
Maynard, D., Li, Y., and Peters, W. (2008). NLP techniques
for term extraction and ontology population.
Meghini, C. and Doerr, M. (2018). A first-order logic
expression of the CIDOC Conceptual Reference Model.
International Journal of Metadata, Semantics and
Ontologies, 13(2):131–149.
Oldman, D. (2014). The CIDOC Conceptual Reference
Model (CIDOC-CRM): PRIMER. CRM Labs.
Ramalho, J. C. and Ferreira, J. C. (2004). Digitarq:
creating and managing a digital archive. In Building
Digital Bridges: Linking Cultures, Commerce and
Science: 8th ICCC/IFIP International Conference on
Electronic Publishing held in Brasília - ELPUB 2004,
Brasilia, Brazil, June 23-26, 2004. Proceedings.
Scifleet, P. (2001). International standard archival
description ISAD(G).
Vitali, S. (2004). Authority control of creators and the
second edition of ISAAR (CPF), international standard
archival authority record for corporate bodies,
persons, and families. Cataloging & Classification
Quarterly, 38(3-4):185–199.
KEOD 2020 - 12th International Conference on Knowledge Engineering and Ontology Development