
of RoleSim, we present the fundamental intuition and
various notations used throughout this paper.
In a directed graph $G = (V, E)$, $V$ and $E$ denote the sets of vertices and edges of $G$, respectively. A node $u$ is an in-neighbor of node $v$ if $(u, v) \in E$; in that case, $v$ is an out-neighbor of $u$. The sets of in-neighbors and out-neighbors of a node $v$ are denoted by $I(v)$ and $O(v)$, respectively. The in-degree and out-degree of a node $v$ are the numbers of its in-neighbors and out-neighbors, denoted by $\deg^-_v$ and $\deg^+_v$, respectively.
Furthermore, $\mathrm{mindeg}^-(u, v)$ is the smaller of the in-degrees of nodes $u$ and $v$, i.e., $\mathrm{mindeg}^-(u, v) = \min(\deg^-_u, \deg^-_v)$. Similarly, $\mathrm{maxdeg}^-(u, v)$ is the larger of the two in-degrees, i.e., $\mathrm{maxdeg}^-(u, v) = \max(\deg^-_u, \deg^-_v)$. For example, consider the node pair $(S1, J1)$ in Figure 1. The in-degree of node $S1$ is 2 and that of node $J1$ is 1, so $\mathrm{mindeg}^-(S1, J1) = \min(2, 1) = 1$ and $\mathrm{maxdeg}^-(S1, J1) = \max(2, 1) = 2$.
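The notation above can be illustrated with a minimal Python sketch. The edge list and the function names (`build_neighbor_sets`, `mindeg_in`, `maxdeg_in`) are assumptions made for illustration; the graph is hypothetical and is not the one shown in Figure 1.

```python
from collections import defaultdict

def build_neighbor_sets(edges):
    """Return (I, O): the in-neighbor and out-neighbor sets of every node."""
    in_nbrs = defaultdict(set)
    out_nbrs = defaultdict(set)
    for u, v in edges:
        out_nbrs[u].add(v)  # (u, v) in E: v is an out-neighbor of u
        in_nbrs[v].add(u)   # (u, v) in E: u is an in-neighbor of v
    return in_nbrs, out_nbrs

def mindeg_in(in_nbrs, u, v):
    """mindeg^-(u, v): the smaller of the two in-degrees."""
    return min(len(in_nbrs[u]), len(in_nbrs[v]))

def maxdeg_in(in_nbrs, u, v):
    """maxdeg^-(u, v): the larger of the two in-degrees."""
    return max(len(in_nbrs[u]), len(in_nbrs[v]))

# Hypothetical edge list: node "s" has in-degree 2, node "j" has in-degree 1.
edges = [("a", "s"), ("b", "s"), ("a", "j")]
I, O = build_neighbor_sets(edges)
print(mindeg_in(I, "s", "j"), maxdeg_in(I, "s", "j"))  # prints: 1 2
```

Sets are used for $I(v)$ and $O(v)$ so that duplicate edges in the input do not inflate the degrees.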
The RoleSim algorithm is founded on the concept of maximal matching of neighbors' similarity: it recursively defines the similarity between two nodes as the mean similarity of the maximum weight matching among their neighbors. Maximum Weighted Matching (MWM) is a well-known problem in graph theory in which the objective is to find, in a weighted graph, a matching with the highest possible sum of weights. The complete matrix of pairwise similarity values between all nodes is referred to as $R$. The RoleSim algorithm calculates the role similarity $rs(u, v)$ between nodes $u$ and $v$ using the following formula:
$$rs(u, v) = (1 - C) \max_{MA(u,v)} \frac{\sum_{(x,y) \in MA(u,v)} rs(x, y)}{\deg^-_u + \deg^-_v - \mathrm{mindeg}^-(u, v)} + C \tag{1}$$
Here, $x \in I(u)$, $y \in I(v)$, $MA(u, v)$ denotes a matching between $I(u)$ and $I(v)$, $C$ signifies the decay factor ($0 < C < 1$), and $\deg^-_u + \deg^-_v - \mathrm{mindeg}^-(u, v)$ is equivalent to $\mathrm{maxdeg}^-(u, v)$.
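The equivalence of the denominator to $\mathrm{maxdeg}^-(u, v)$ follows from the identity $\min(a, b) + \max(a, b) = a + b$. Spelled out with the definitions of $\mathrm{mindeg}^-$ and $\mathrm{maxdeg}^-$:

$$\deg^-_u + \deg^-_v - \mathrm{mindeg}^-(u, v) = \bigl(\mathrm{mindeg}^-(u, v) + \mathrm{maxdeg}^-(u, v)\bigr) - \mathrm{mindeg}^-(u, v) = \mathrm{maxdeg}^-(u, v)$$

For the node pair $(S1, J1)$ with in-degrees 2 and 1, this gives $2 + 1 - 1 = 2 = \mathrm{maxdeg}^-(S1, J1)$.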
$MA(u, v)$ can be viewed as a weighted bipartite matching that uses the $rs(x, y)$ scores as weights. The weight of the matching, denoted by $w(MA(u, v))$, is the sum of the $rs(x, y)$ scores over all pairs $(x, y)$ in $MA(u, v)$, i.e., $w(MA(u, v)) = \sum_{(x,y) \in MA(u,v)} rs(x, y)$. A matching $MA(u, v)$ is said to be maximal if its weight is the maximum among all possible matchings. Such a matching is denoted by $\widetilde{M}(u, v)$, and its weight is denoted by $M(u, v)$, i.e., $M(u, v) = w(\widetilde{M}(u, v))$.
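As a rough illustration of these definitions, the sketch below computes the weight of a matching and finds a maximal matching by exhaustive search. The function names and the score values are hypothetical. Exhaustive enumeration is used only for readability and is not claimed to be how RoleSim solves the matching problem. The sketch also assumes non-negative scores, so that only matchings of full size $\mathrm{mindeg}^-(u, v)$ need to be considered.

```python
from itertools import permutations

def matching_weight(matching, rs):
    """w(MA(u, v)): sum of rs(x, y) over all pairs (x, y) in the matching."""
    return sum(rs[(x, y)] for x, y in matching)

def max_weight_matching(I_u, I_v, rs):
    """Return (matching, weight) of a maximum-weight matching by brute force.

    Every one-to-one assignment of the smaller neighbor set into the larger
    one is enumerated, so this is only practical for very small inputs.
    """
    I_u, I_v = list(I_u), list(I_v)
    if len(I_u) <= len(I_v):
        candidates = (list(zip(I_u, ys)) for ys in permutations(I_v, len(I_u)))
    else:
        candidates = (list(zip(xs, I_v)) for xs in permutations(I_u, len(I_v)))
    best, best_w = [], float("-inf")
    for matching in candidates:
        w = matching_weight(matching, rs)
        if w > best_w:
            best, best_w = matching, w
    return best, best_w

# Hypothetical similarity scores between I(u) = {a, b} and I(v) = {d, e}.
rs = {("a", "d"): 0.75, ("a", "e"): 0.25, ("b", "d"): 0.5, ("b", "e"): 0.5}
matching, weight = max_weight_matching(["a", "b"], ["d", "e"], rs)
print(matching, weight)  # prints: [('a', 'd'), ('b', 'e')] 1.25
```

For larger neighbor sets, a polynomial-time assignment solver would be the natural replacement for the enumeration.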
Using the notation $\widetilde{M}(u, v)$ and $M(u, v)$, the definition of $rs(u, v)$ in Equation 1 can also be expressed as follows (Rothe and Schütze, 2014):
$$rs(u, v) = (1 - C)\,\frac{M(u, v)}{\mathrm{maxdeg}^-(u, v)} + C \tag{2}$$
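Equation 2 reduces to a one-line computation once the maximal matching weight is known. The following sketch assumes that weight is supplied by the caller; the function name, the input values, and the choice $C = 0.1$ are illustrative only. The case of two nodes without in-neighbors is not covered by the definitions above, so the sketch rejects it rather than guessing a value.

```python
def rolesim_score(max_matching_weight, indeg_u, indeg_v, decay):
    """Equation (2): rs(u, v) = (1 - C) * M(u, v) / maxdeg^-(u, v) + C."""
    if not 0.0 < decay < 1.0:
        raise ValueError("the decay factor C must satisfy 0 < C < 1")
    maxdeg = max(indeg_u, indeg_v)
    if maxdeg == 0:
        # Assumption: this case is left undefined here on purpose.
        raise ValueError("both nodes have no in-neighbors")
    return (1.0 - decay) * max_matching_weight / maxdeg + decay

# Hypothetical inputs: M(u, v) = 2.0, in-degrees 3 and 5, C = 0.1.
score = rolesim_score(2.0, 3, 5, decay=0.1)
print(round(score, 4))  # approximately 0.46
```

Because every $rs(x, y)$ is at most 1 and the matching has at most $\mathrm{mindeg}^-(u, v)$ pairs, the fraction never exceeds 1, so the score stays within $[C, 1]$.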
The matching selection process used by RoleSim
is explained using the following example.
Example 3.1. Consider a directed graph $G = (V, E)$, where $u, v \in V$ are two nodes. The set of in-neighbors of node $u$ is $I(u) = \{a, b, c\}$, while the set of in-neighbors of node $v$ is $I(v) = \{d, e, f, g, h\}$. A subset of the RoleSim matrix of values ($R$) is presented in Figure 2, where each value represents the similarity of one pairing of in-neighbors of these two vertices. Assume that these values have the following ordering: $rs(a, d) = \max(rs(a, :))$, $rs(b, f) = \max(rs(b, :))$, and $rs(c, e) = \max(rs(c, :))$, where $rs(x, :)$ denotes the row of values for in-neighbor $x$.
In RoleSim, a matching involves selecting at most one cell from each row and each column. When the number of rows differs from the number of columns, the size of the matching is limited to $\mathrm{mindeg}^-(u, v)$. In this example, the matching size is restricted to 3. A maximal matching is a matching in which the sum of the selected cells is maximized. Because the three row maxima assumed above lie in distinct columns ($d$, $f$, and $e$), selecting them yields the maximal matching. As depicted in Figure 2, following the principle of maximum weighted matching, the maximal matching results of the in-neighbor similarity matrix are enclosed by a solid square and can be expressed as $M(u, v) = rs(a, d) + rs(b, f) + rs(c, e)$. In the subsequent sections of this paper, $M_1(u, v)$ will be used to refer to this maximal weighted matching result of the in-neighbor similarity matrix for the node pair $(u, v)$ generated by the RoleSim algorithm; it is referred to as the first-order maximal weighted matching result. This distinguishes it from the higher $\Gamma^{th}$-order maximal weighted matching used in the proposed FaRS algorithm (Section 5). For instance, $M_2(u, v)$ denotes the second-largest weighted matching result.
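The selection described in Example 3.1 can be checked with a small sketch. The matrix values below are hypothetical and are not the values of Figure 2. They are chosen only so that the row maxima fall at $(a, d)$, $(b, f)$, and $(c, e)$, as the example assumes. The helper `maximal_matching` is an illustrative brute-force search that assumes the matrix has no more rows than columns.

```python
from itertools import permutations

rows = ["a", "b", "c"]            # I(u)
cols = ["d", "e", "f", "g", "h"]  # I(v)

# Hypothetical rs(x, y) values, one row per in-neighbor of u.
R = [
    [0.75, 0.25, 0.5, 0.25, 0.125],   # a: row maximum at d
    [0.25, 0.5, 0.875, 0.25, 0.5],    # b: row maximum at f
    [0.5, 0.625, 0.25, 0.125, 0.25],  # c: row maximum at e
]

def maximal_matching(R):
    """Pick one cell per row, all in distinct columns, with the largest sum."""
    n_rows, n_cols = len(R), len(R[0])
    best_cols, best_w = None, float("-inf")
    for chosen in permutations(range(n_cols), n_rows):
        w = sum(R[i][j] for i, j in enumerate(chosen))
        if w > best_w:
            best_cols, best_w = chosen, w
    return best_cols, best_w

chosen, weight = maximal_matching(R)
pairs = [(rows[i], cols[j]) for i, j in enumerate(chosen)]
print(pairs, weight)  # prints: [('a', 'd'), ('b', 'f'), ('c', 'e')] 2.25
```

The matching has size 3, which equals $\mathrm{mindeg}^-(u, v)$ for this pair, and its weight is the sum of the three row maxima.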
Figure 2: In-Neighbour Similarity Matrix of Node-Pair
(u, v).
The RoleSim algorithm follows an iterative process to calculate the role similarity score of every node pair $(u, v) \in V \times V$ and consists of two phases. In the first phase, the role similarity score matrix $R$ is initialized. In the second phase, during the $k^{th}$ iteration, the role similarity score of the node pair $(u, v)$ is computed from the role similarity scores of the previous, $(k - 1)^{th}$, iteration. This computation is performed using the following equation:
FaRS: A High-Performance Automorphism-Aware Algorithm for Graph Similarity Matching