4 DISCUSSION
KGs allow information from websites to be represented in a
machine-understandable format and, consequently, allow the
exploitation of semantics, i.e., the relations that connect
entities in the KG, with methods that are closer to human
thinking. Exploiting semantics can greatly aid
question-answering systems and data-driven models trained on
KGs. In addition, translating data into KGs automatically
enables the interconnection and reuse of the translated data.
This is a great advantage, especially for domain-related KGs,
as many systems can access and use the data available in the
cloud of KGs.
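To make the notion of exploiting relations concrete, the following is a minimal sketch in plain Python of a KG stored as subject-predicate-object triples, with a simple pattern matcher and one relation-based query. All identifiers (the `ex:` predicates, the article, dataset, and theme names) are hypothetical and are not taken from the actual Eurostat KG schema; the helper functions `match` and `datasets_for_theme` are likewise illustrative.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical identifiers for illustration only; they are not the
# vocabulary of the actual Eurostat KG.
TRIPLES: List[Triple] = [
    ("article:example_gdp_overview", "rdf:type", "ex:Article"),
    ("dataset:example_gdp", "rdf:type", "ex:Dataset"),
    ("dataset:example_population", "rdf:type", "ex:Dataset"),
    ("article:example_gdp_overview", "ex:refersTo", "dataset:example_gdp"),
    ("article:example_gdp_overview", "ex:hasTheme", "theme:Economy"),
    ("dataset:example_gdp", "ex:hasTheme", "theme:Economy"),
    ("dataset:example_population", "ex:hasTheme", "theme:Population"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given pattern.

    A value of None acts as a wildcard for that position.
    """
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (
            o is None or t[2] == o
        ):
            yield t


def datasets_for_theme(triples: List[Triple], theme: str) -> List[str]:
    """Return the sorted subjects that are datasets and carry the given theme."""
    themed = {s for s, _, _ in match(triples, p="ex:hasTheme", o=theme)}
    datasets = {s for s, _, _ in match(triples, p="rdf:type", o="ex:Dataset")}
    return sorted(themed & datasets)
```

A question such as "which datasets deal with the economy?" then reduces to `datasets_for_theme(TRIPLES, "theme:Economy")`, which combines two relations instead of searching free text. A production system would typically use an RDF store and SPARQL rather than an in-memory list.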
In this paper, we presented the Eurostat KG, which contains
most of the information from the Eurostat and OECD websites,
such as information about articles and datasets, the
interconnections between them and external sources, and
information on various classifications for the articles and
datasets, among others. We described how we developed the
schema of the ontology, how we captured the data that we used
to infer the schema, and how we populated the KG with the
aforementioned information.
The creation of the Eurostat KG offers the following
benefits: (i) it increases the discoverability and
accessibility of data available for analytical purposes, (ii)
it strengthens Eurostat's position within the Commission as a
provider of statistical data and services for its internal
users, and (iii) it improves the methods for extracting
information from unstructured data sources, especially data
available on the web.
As for future work, we plan to create a visualization
mechanism that will project pieces of the KG. Moreover, we
will link the KG with more external knowledge, for instance
from DBpedia and/or ConceptNet, and with other knowledge
graphs, e.g., from the EU Open Data portal¹⁴. We also plan to
extend the current KG with more knowledge coming from related
statistical agencies in Europe or worldwide, the Euro SDMX
Registry¹⁵, or the RAMON Metadata Server¹⁶.
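As an illustration of what linking the KG to an external source could look like, the sketch below proposes `owl:sameAs` links by matching normalized labels. This is an assumption made for illustration, not the linking method planned by the authors: exact label matching is deliberately naive, the `ex:concept/...` URIs are hypothetical, and the `dbr:` names are only written to resemble DBpedia resource names.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def normalize(label: str) -> str:
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def link_by_label(
    local_labels: Dict[str, str], external_labels: Dict[str, str]
) -> List[Triple]:
    """Propose owl:sameAs triples for entities whose normalized labels are equal.

    Both arguments map an entity URI to its human-readable label.
    Local entities without a match are skipped.
    """
    # Index the external entities by normalized label for O(1) lookup.
    index = {normalize(lbl): uri for uri, lbl in external_labels.items()}
    links: List[Triple] = []
    for uri, lbl in sorted(local_labels.items()):
        target = index.get(normalize(lbl))
        if target is not None:
            links.append((uri, "owl:sameAs", target))
    return links


# Hypothetical local concepts and external resources.
LOCAL = {
    "ex:concept/unemployment": "Unemployment",
    "ex:concept/gdp": "Gross  domestic product",
    "ex:concept/nuts": "NUTS classification",
}
EXTERNAL = {
    "dbr:Unemployment": "unemployment",
    "dbr:Gross_domestic_product": "Gross Domestic Product",
    "dbr:Inflation": "Inflation",
}

LINKS = link_by_label(LOCAL, EXTERNAL)
```

Here `LINKS` contains one `owl:sameAs` triple for each of the two concepts with a matching label, and the unmatched concept is left unlinked. In practice, candidate links of this kind would need disambiguation and manual or statistical validation before being added to the KG.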
ACKNOWLEDGEMENT
The NLP4StatRef project was funded from Eurostat Framework
Contract N° 2018.0088, Lot 1: Methodological support, in
Specific contract N° 000068 - NLP4StatRef: “Methodological
support on advanced methods for accessing, ingesting and
linking textual information using semantic analysis and
natural language processing”. We are grateful for the help
and feedback provided by the European Commission’s officers
responsible for the project: Mátyás Mészáros (Eurostat),
Jacopo Grazini (DG DIGIT), Jean-Marc Museux (Eurostat) and
Martin Karlberg (Eurostat).
¹⁴ https://data.europa.eu/en
¹⁵ https://webgate.ec.europa.eu/sdmxregistry/
¹⁶ https://ec.europa.eu/eurostat/ramon
Building Eurostat Knowledge Graph