(Bronstein et al., 2021). Based on this, GNN link prediction
models make it possible to combine node features with
graph topology.
In this study we will use GNN link prediction
models to find missing links in knowledge graphs.
Finding missing links in knowledge graphs helps to
solve numerous problems, in particular knowledge
graph incompleteness. Adding links to knowledge
graphs also makes it possible to detect unknown
relationships between graph nodes.
The experiments in our previous knowledge graph
study (Romanova, 2020) were based on finding unknown
relationships between modern art artists. As data for
those experiments we used artist biographies, known
relationships between artists and data about modern
art movements. For the experiments in this study we
will use Wikipedia articles about the same 20 modern
art artists (please see Table 1).
We will examine two different scenarios: one
scenario is based on artist names and the full text of
Wikipedia articles, and the other is based on the
distribution of co-located words within and across
the articles.
For the first scenario we will build an initial
knowledge graph with artist names and Wikipedia text
as nodes and relationships between artists and their
corresponding articles as edges. Then we will embed
node features through transformer models and generate
additional edges for artist pairs whose corresponding
Wikipedia article vectors have high cosine similarity.
The modified knowledge graph will be used as input
data to a GNN link prediction model.
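The edge generation step of the first scenario can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the artist names, the three-dimensional vectors standing in for transformer embeddings, and the threshold of 0.5 are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_edges(vectors, threshold):
    """Return (name_a, name_b, similarity) for every pair above the threshold."""
    edges = []
    for a, b in combinations(sorted(vectors), 2):
        sim = cosine(vectors[a], vectors[b])
        if sim > threshold:
            edges.append((a, b, sim))
    return edges


# Hypothetical toy vectors standing in for article embeddings.
vectors = {
    "artist_a": [1.0, 0.0, 0.0],
    "artist_b": [0.9, 0.1, 0.0],
    "artist_c": [0.0, 0.0, 1.0],
}
# The threshold value is an assumption chosen for illustration only.
edges = similarity_edges(vectors, threshold=0.5)
```

With these toy vectors only the first two artists are linked, because the third vector is orthogonal to both of the others.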
For the second scenario we will build an initial
knowledge graph with pairs of co-located words as
nodes and pairs of nodes with common words as edges.
That knowledge graph will represent not only word
sequences within articles but also chains of words
across Wikipedia articles about different artists.
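The graph construction of the second scenario can be sketched as follows. This is a minimal illustration under assumptions that may differ from the study: co-located words are taken to be adjacent words after lowercasing and whitespace tokenization, two nodes are connected if they share at least one word, and the two short texts are hypothetical stand-ins for Wikipedia articles.

```python
from itertools import combinations


def word_pair_nodes(text):
    """Pairs of adjacent words in a text (an assumed reading of co-located words)."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def build_graph(articles):
    """Nodes are word pairs; an edge joins two nodes that share a word."""
    nodes = set()
    for text in articles.values():
        nodes.update(word_pair_nodes(text))
    edges = {
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if set(a) & set(b)
    }
    return nodes, edges


# Hypothetical toy texts standing in for two artist articles.
articles = {
    "artist_a": "abstract painting style",
    "artist_b": "cubist painting in paris",
}
nodes, edges = build_graph(articles)
```

Because the node set is shared across all articles, a word that appears in both texts (here "painting") connects word pairs from different articles, which is how chains of words across articles arise.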
After running GNN link prediction models on
top of both knowledge graphs, we will rewire the
initial knowledge graphs through similarities of
re-embedded nodes.
In this paper we will do the following:
• Describe related work.
• Present raw data analysis.
• Describe methods of data preparation, model
training and interpreting model results.
• Explain how to rewire knowledge graphs in the
different scenarios based on interpreting the model
results.
• Illustrate applications of highly similar and highly
dissimilar artist pairs for recommender systems.
• Emphasize that, for graph mining, pairs of dissimilar
nodes provide quite different value than pairs
of similar nodes.
2 RELATED WORK
After being introduced by Google, the knowledge graph
was adopted by many companies as a powerful way to
integrate and search various data, whether structured,
unstructured or semi-structured, taken from a variety
of sources. Knowledge graphs combine internal
data with public knowledge, drive a variety of data
products and make them more intelligent (Noy et al.,
2019).
A knowledge graph organizes data of various types
and volumes to highlight relationships between
data points. These relationships are one of the main
reasons for the popularity of knowledge graphs, but in
practice they are often incomplete in existing
knowledge graphs.
In addition, real-world data are often dynamic and
evolving, which makes it difficult to construct correct
and complete knowledge graphs. Automatically
constructing complete dynamic knowledge graphs is
therefore a challenging task. Link prediction is one of
the ways to address these problems (Wang et al., 2021).
Link prediction is a fundamental problem that
attempts to estimate the likelihood that a link exists
between two nodes, which makes it easier to understand
associations between two specific nodes and
how the entire network evolves (Wu et al., 2022). The
problem of link prediction over complex networks can
be categorized into two classes. One is to reveal
missing links. The other is to predict links that
may exist in the future as the network evolves.
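To make the notion of estimating link likelihood concrete, one common choice in embedding-based link prediction is to score a node pair by the sigmoid of the dot product of the two node embeddings. The sketch below illustrates that idea only; the two-dimensional embeddings and node names are hypothetical, and the cited works are not claimed to use this particular scoring function.

```python
import math


def link_probability(z_u, z_v):
    """Sigmoid of the dot product of two node embeddings."""
    score = sum(a * b for a, b in zip(z_u, z_v))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical embeddings: n1 and n2 point in similar directions, n3 is opposite to n1.
z = {"n1": [1.0, 2.0], "n2": [1.5, 1.0], "n3": [-1.0, -2.0]}

p_similar = link_probability(z["n1"], z["n2"])
p_opposite = link_probability(z["n1"], z["n3"])
```

Pairs whose embeddings point in similar directions receive a probability above 0.5, and pairs with opposing embeddings receive a probability below 0.5.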
Various types of link prediction have been widely
applied in a variety of fields. In social networks link
predictions support potential collaborations and help
to find assistants. In biology and medicine link
predictions provide the ability to foresee hidden
associations such as protein–protein interactions
(Zhou, 2021).
In recent years, link prediction has been used
extensively in social networks, citation networks,
biological networks, recommender systems, security
and so on, and link prediction models have attracted
more and more studies.
Before GNNs became an emerging research area,
link prediction techniques were based either on graph
topology or on node features (Zhou et al., 2009).
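As an illustration of the topology-based family, two well-known heuristics score a candidate link by the neighborhoods of its endpoints: the number of common neighbors and the Jaccard coefficient. The sketch below is a minimal example on a hypothetical four-node graph; it is not claimed to reproduce the specific methods evaluated in the cited work.

```python
def common_neighbors(adj, u, v):
    """Number of neighbors shared by nodes u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


# Hypothetical undirected graph stored as adjacency sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

# Score the candidate (currently missing) link between a and d.
cn_score = common_neighbors(adj, "a", "d")
jc_score = jaccard(adj, "a", "d")
```

Both scores use only the graph structure and ignore node features entirely, which is the limitation that models combining topology with features aim to address.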
There has been a surge of algorithms that perform
link prediction through representation learning that
learns low-dimensional embeddings, such as DeepWalk
(Perozzi et al., 2014), node2vec (Grover and
Leskovec, 2016), etc. Over the years many link
ICAART 2023 - 15th International Conference on Agents and Artificial Intelligence