considered approaches have initially been evaluated
by applying the algorithms to data extracted from the
PubMed repository. The produced clustering solutions
have been validated on two different datasets by two
different cluster validation measures: F-measure and
Silhouette Index (SI). The two Bipartite Correlation
Clustering (BCC) algorithms have slightly outperformed
the Partitioning-based algorithm on average with respect
to SI on the first dataset. The Merge-Split BCC algorithm
has also demonstrated better performance than
the other two algorithms on the second dataset. This
algorithm is able to analyze the correlations between
two clustering solutions and, based on the discovered
patterns, treats the clusters in different ways. In addition,
in contrast to the Partitioning-based clustering
algorithm, the two BCC algorithms do not need
prior knowledge about the optimal number of clusters
in order to produce a good clustering solution. The
BCC algorithms are also more suitable for the considered
expertise retrieval context, because each cluster
is modelled by a list of domain-specific topics, i.e.,
analogously to the experts’ expertise profiles.
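As an illustration of the SI-based validation referred to above, the
measure can be sketched in a few lines. This is a minimal sketch and not
the implementation used in the evaluation: it assumes Euclidean distance,
and the function name silhouette_index and the toy points are
hypothetical, chosen only for this example.

```python
from math import dist


def silhouette_index(points, labels):
    """Mean silhouette width over all points (Euclidean distance assumed)."""
    # Group the points by their cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("SI is defined only for two or more clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Common convention: a singleton cluster contributes 0.
            scores.append(0.0)
            continue
        # a: mean distance from p to the other members of its own cluster
        # (dist(p, p) == 0, so only the divisor excludes p itself).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance from p to the members of another cluster.
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
si_good = silhouette_index(toy_points, [0, 0, 1, 1])  # pairs kept together
si_bad = silhouette_index(toy_points, [0, 1, 0, 1])   # pairs split apart
```

On this toy input, the grouping that keeps nearby points together yields a
positive SI, while the grouping that splits them yields a negative one,
which is the behaviour the measure is meant to capture.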
For future work, we aim to pursue further comparison
and evaluation of the three proposed clustering
approaches on richer data extracted from different online
sources.
