process and information extraction techniques with-
out an extensive involvement of domain experts for
the validation of the extracted instances.
ACKNOWLEDGEMENTS
We would like to acknowledge the Józef Piłsudski
Institute of America for providing us with the rich
archival collections. Also, we would like to thank and
commemorate Marek Zieliński, Vice-President of the
Piłsudski Institute of America, for his invaluable con-
tribution to both the intellectual and practical side at
each stage of the work.
ARKIVO Dataset: A Benchmark for Ontology-based Extraction Tools