C = {pairs that are in the same cluster under $C'$ but in different clusters under $C''$}
D = {pairs that are in different clusters under $C'$ but in the same cluster under $C''$}
Therefore the Rand index to measure the similarity between two clusterings $C'$ and $C''$ is given by

$$R = \frac{|A| + |B|}{|A| + |B| + |C| + |D|} = \frac{|A| + |B|}{\binom{N}{2}} = \frac{2(|A| + |B|)}{N(N-1)}$$
Here $|A| + |B|$ can be considered as the number of agreements between $C'$ and $C''$, and $|C| + |D|$ as the number of disagreements between $C'$ and $C''$. Notice that since the denominator $|A| + |B| + |C| + |D|$ is the total number of unordered pairs of nodes, it can be written as $\binom{N}{2}$. The Rand index therefore represents the fraction of node pairs on which the two clusterings agree, or equivalently the probability that $C'$ and $C''$ will agree on a randomly chosen pair of nodes. Mathematically, the sets can be written as:
$$
\begin{aligned}
A &= \{(o_i, o_j) \mid o_i, o_j \in C'_{k},\ o_i, o_j \in C''_{l}\} \\
B &= \{(o_i, o_j) \mid o_i \in C'_{k_1},\ o_j \in C'_{k_2},\ o_i \in C''_{l_1},\ o_j \in C''_{l_2}\} \\
C &= \{(o_i, o_j) \mid o_i, o_j \in C'_{k},\ o_i \in C''_{l_1},\ o_j \in C''_{l_2}\} \\
D &= \{(o_i, o_j) \mid o_i \in C'_{k_1},\ o_j \in C'_{k_2},\ o_i, o_j \in C''_{l}\}
\end{aligned}
$$

for some $1 \le i, j \le n$, $i \ne j$, $1 \le k, k_1, k_2 \le r$, $k_1 \ne k_2$, $1 \le l, l_1, l_2 \le s$, $l_1 \ne l_2$.
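The four pair sets can be counted directly from two vectors of cluster labels. The following is a minimal base-R sketch, not the authors' implementation; the function names `pair_counts` and `rand_index` and the four-node toy labels are illustrative assumptions.

```r
# Count the pair sets A, B, C, D for two clusterings given as label vectors.
# pair_counts() and rand_index() are hypothetical helper names.
pair_counts <- function(c1, c2) {
  stopifnot(length(c1) == length(c2), length(c1) >= 2)
  same1 <- outer(c1, c1, "==")  # TRUE where a pair shares a cluster in the first clustering
  same2 <- outer(c2, c2, "==")  # TRUE where a pair shares a cluster in the second clustering
  up <- upper.tri(same1)        # keep each unordered pair (i < j) exactly once
  c(A = sum(same1[up] & same2[up]),    # same cluster in both
    B = sum(!same1[up] & !same2[up]),  # different clusters in both
    C = sum(same1[up] & !same2[up]),   # same in the first, different in the second
    D = sum(!same1[up] & same2[up]))   # different in the first, same in the second
}

# Rand index: agreements divided by the total number of unordered pairs.
rand_index <- function(c1, c2) {
  n <- pair_counts(c1, c2)
  unname((n["A"] + n["B"]) / sum(n))
}

# Toy example (assumed labels for four nodes, not taken from the paper).
toy1 <- c(1, 1, 2, 2)
toy2 <- c(1, 1, 1, 2)
pair_counts(toy1, toy2)  # A = 1, B = 2, C = 1, D = 2
rand_index(toy1, toy2)   # 3 agreements out of 6 pairs = 0.5
```

Because the four counts always sum to $\binom{N}{2}$, dividing by `sum(n)` is the same as dividing by `choose(length(c1), 2)`.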
For illustrative purposes, consider a network (Figure 2) with nodes $\{a, b, c, d, e, f, g, h, i, j, k\}$. Further, let us have two partitions $C' = \{\{a,b,c,d,e,f\},\{g,h,i,j,k\}\}$ and $C'' = \{\{a,b,c,d,e\},\{f,g\},\{h,i,j,k\}\}$. The meaning of these clusters is as follows: network 1 has 6 nodes ($\{a,b,c,d,e,f\}$), network 2 has 5 nodes ($\{g,h,i,j,k\}$), and they have two common nodes $\{f,g\}$. Then we have $A = \{(a,b), (a,c), (a,d), (a,e), (b,c), (b,d), (b,e), (c,d), (c,e), (d,e), (h,i), (h,j), (h,k), (i,j), (i,k), (j,k)\}$ with $|A| = 16$. Further, $B = \{(a,g), (a,h), (a,i), (a,j), (a,k), (b,g), (b,h), \ldots\}$ with $|B| = 29$. Therefore we get $R = 2(16 + 29)/110 = 0.81$. Now if we consider $C'' = \{\{a,b,c,d\},\{e,f,g,h\},\{i,j,k\}\}$, i.e. Figure 4, then the Rand index turns out to be 0.67. This means the fork has been reduced from 0.81 to 0.67 just by considering another partition, and therefore the Rand index is highly dependent upon how the nodes are partitioned into clusters. The Rand index can be computed easily using the following code in the R statistical programming language:
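library(fossil)  # provides rand.index()
# Define clusterings C1 and C2 as cluster labels for nodes a..k
# (C1 encodes C'; C2 encodes the Figure 4 partition C'')
C1 = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
C2 = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3)
# Calculate the Rand index between C1 and C2 (approximately 0.67)
rand.index(C1, C2)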
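The two values quoted for this example can also be checked by hand. The sketch below counts pairs from the cluster sizes, using the facts that $|A| + |C|$ is the number of pairs sharing a cluster under $C'$, $|A| + |D|$ is the number sharing a cluster under $C''$, and all four sets together contain $\binom{11}{2} = 55$ pairs. It is a verification under the partitions stated in the text, not an additional result.

$$
\begin{aligned}
\text{Figure 2:}\quad |A| &= \tbinom{5}{2} + \tbinom{4}{2} = 16, &
|C| &= \tbinom{6}{2} + \tbinom{5}{2} - 16 = 9, \\
|D| &= \tbinom{5}{2} + \tbinom{2}{2} + \tbinom{4}{2} - 16 = 1, &
|B| &= 55 - 16 - 9 - 1 = 29, \\
R &= \frac{16 + 29}{55} = \frac{45}{55} \approx 0.818 \\[6pt]
\text{Figure 4:}\quad |A| &= \tbinom{4}{2} + 1 + 1 + \tbinom{3}{2} = 11, &
|C| &= 25 - 11 = 14, \\
|D| &= \tbinom{4}{2} + \tbinom{4}{2} + \tbinom{3}{2} - 11 = 4, &
|B| &= 55 - 11 - 14 - 4 = 26, \\
R &= \frac{11 + 26}{55} = \frac{37}{55} \approx 0.673
\end{aligned}
$$

In the Figure 4 case, the two single pairs in $|A|$ are $(e,f)$ and $(g,h)$, the only pairs of the middle cluster that also share a cluster under $C'$. The value 0.81 reported in the text corresponds to $45/55$ truncated to two decimals.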
We have analysed a spectrum of cases of agreement, and it is clear that the more interconnections there are among clusters, the better the fairness in terms of incentive and load/node distribution in the blockchain network. In a similar manner, we can consider these intersection nodes as open permissioned nodes which actually support PoS consensus. Figures 1 to 4 show the importance of interconnections in influencing fairness.
Figure 1 shows Cluster 1 and Cluster 2, which initially do not have any common nodes or members. Each cluster achieves correct consensus independently, and hence the two clusters violate agreement with each other. Connecting the two clusters reduces the disagreement between them. The connectivity represents the common members of both clusters, who arrive at a single point of agreement: the more common members, the higher the agreement. In Figure 1, the two separate clusters each follow an independent consensus, so fork formation is at its maximum, as there is no point of agreement (no common members). Through the Rand index, as explained above, we count the number of pairs of elements that belong to the same cluster in both clusterings, along with the number of pairs of elements that are in different clusters in both clusterings, over all possible pairs.
In Figure 2, the Rand index is 81%, where the common members are 18% of the total nodes of the two clusters. That means, unlike the 100% fork in Figure 1, in Figure 2 the fork is reduced to 81%. As we increase the percentage of common members of the two clusters, we observe that the fork percentage is reduced. In Figure 3, the Rand index is 74%, whereas the percentage of common members is 27%. In Figure 4, the values are 67% and 36% respectively.
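The common-member percentages follow from the 11-node example. The short check below assumes that the common members are the nodes of the bridging cluster in $C''$ ($\{f,g\}$ for Figure 2 and $\{e,f,g,h\}$ for Figure 4). The partition for Figure 3 is not given in the text, so three common nodes is an assumption inferred from the reported 27%.

$$
\text{Figure 2: } \frac{2}{11} \approx 18\%, \qquad
\text{Figure 3 (assumed): } \frac{3}{11} \approx 27\%, \qquad
\text{Figure 4: } \frac{4}{11} \approx 36\%
$$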
Figure 1: Two separate clusters having independent consensus.
Our next step is to distribute the percentage of
nodes of the entire network among permissionless
and permissioned consensus under a Byzantine fault-