Table 2 shows the results for the CAL cluster-
ing algorithm using the RAL process. By compar-
ing all the results that use the Silhouette index, it can
be seen that the traditional Silhouette index selected
the best data partition in 7 out of the 14 data sets; the corresponding weighting approach obtained the same number, although the selected partitions are not the same. The results obtained by learning a distance function were slightly better: both with and without score weighting, the indices picked the best partition 8 times. Doing the same analysis for the
Normalized Hubert’s statistic results, we noticed that
the traditional approach identifies the best partition in only 5 of the 14 data sets. The weighted score approach with the original distance metric chooses the best partition 6 times. With a learned distance metric the best partition is selected 8 times, and combining it with the constraint satisfaction weighting raises this to 9 out of the 14 data sets.
The results for the PCKM clustering algorithm
using the RAL process are shown in Table 3. The
worst results were again achieved by the traditional validity approach, with the best partitions being selected only 4 times by both the Silhouette index and
the Normalized Hubert’s statistic. The results for the
weighted score approach are somewhat better: both indices selected the best partition 6 times. By learning a new distance metric the results are considerably better. The Silhouette index selects the best partition 8 and 6 times with and without the score weighting, respectively, while the Normalized Hubert’s statistic identifies the best partition in 9 out of the 14 data sets both with and without the score weighting.
Table 4 shows the results for the CAL clustering
algorithm using the RAC process. The traditional Silhouette index picked the best partition 8 times, while the score weighting approach did so 10 times with both the original and the learned metrics. The simple distance learning approach selected the best partitions in 8 out of the 14 data sets. The Normalized Hubert’s statistic results were not as good, with the best partition selected only 3, 4, 8 and 6 times by the traditional, score weighting, distance learning, and distance learning plus score weighting approaches, respectively. Nonetheless, the results obtained using
constraints are better, especially when the distance
metric was learned.
The results for the PCKM algorithm using the
RAC process are presented in Table 5. The traditional
Silhouette index determined the best data partitions 8
times and the corresponding weighted score approach
identified the same best partitions plus another one.
The simple distance learning approach selected the
best partition in 9 out of the 14 data sets and com-
bining it with the weighted score approach decreases
the number of identified best partitions by one. The
traditional Normalized Hubert’s statistic selects the best partition only 5 times, and the score weighting approach 8 times. The simple metric learning approach also picks the best partition 8 times and, again, the weighted score with distance learning approach reduces the number of identified best partitions by one. We may conclude from the previous
results that the incorporation of constraints clearly in-
creases the performance of the clustering validation
process. Simply weighting a validity index score by the constraint satisfaction ratio already improves the results, and learning a new metric based on the pairwise constraints leads to even better results.
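As a concrete sketch of the score weighting idea, assuming the weighted score is simply the product of the internal index value and the fraction of satisfied constraints (the exact combination rule is not shown in this section, and the names below are illustrative):

```python
def satisfaction_ratio(labels, must_link, cannot_link):
    """Fraction of pairwise constraints a partition satisfies.

    labels: cluster assignment per point; constraints are (i, j) index pairs.
    """
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 1.0

def weighted_validity(index_value, labels, must_link, cannot_link):
    # Assumed combination: scale the internal index (e.g. Silhouette)
    # by the satisfaction ratio so constraint-violating partitions
    # are penalised when selecting the best partition.
    return index_value * satisfaction_ratio(labels, must_link, cannot_link)
```

The best partition is then the one maximising the weighted score across the candidate partitions produced by the clustering algorithm.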
Table 6 indicates the number of times that each va-
lidity measure (by line) achieved better/worse/equal
results than the other validity measures (by column).
The Silhouette with the score weighting approach obtained better results than the traditional Silhouette 14 times and worse results 9 times. The Silhouette with distance learning achieved better partitions 10 times and worse ones only 6 times. The metric learning combined with the score weighting achieved 16 better results and 10 worse. Performing the same analysis for
the Normalized Hubert’s statistic, the weighted score
approach was better than the traditional one 15 times
and worse 9 times. The distance learning approach obtained
better results 21 times and 7 worse. The combination
of metric learning and score weighting obtained 22
improvements and only 12 worse results. These results show once more that constrained clustering validation outperforms the traditional validation approach, especially when distance learning is used.
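The better/worse/equal counts in Table 6 amount to pairwise comparisons of the partitions each measure selected across the experiments. A minimal sketch, assuming higher scores mean better partitions and using hypothetical names:

```python
def compare_counts(scores_a, scores_b):
    """Count experiments where measure A selected a better / worse /
    equally good partition than measure B (higher score = better)."""
    better = sum(a > b for a, b in zip(scores_a, scores_b))
    worse = sum(a < b for a, b in zip(scores_a, scores_b))
    equal = len(scores_a) - better - worse
    return better, worse, equal
```

Each cell of a table like Table 6 would then hold the triple returned for one (row measure, column measure) pair.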
Figure 6 shows the plots of the consistency index
values and the constraint satisfaction ratio obtained
for all partitions produced in our experiments versus
each internal validation index, distinguished by clus-
tering algorithm and constraint acquisition method,
and Table 7 presents the correlation between the internal validation indices and the consistency index. It
can be seen that very different consistency values may
be achieved for partitions with all constraints satisfied
(Figure 6c). This indicates that the constraint satisfaction ratio alone is not a good indicator of the quality of the partitions, which is corroborated by the low
correlation with the consistency index. We can also
conclude that the validation approaches that use dis-
tance learning have higher correlation with the con-
sistency index, which is another indication that these
are a good option for clustering validation.
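Assuming the correlations in Table 7 are standard Pearson correlations between each index's values and the consistency index over all produced partitions, they can be computed as:

```python
def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences,
    e.g. an internal index's scores and the consistency index values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

A validation index with a Pearson correlation near 1 against the consistency index ranks partitions almost the same way the external quality measure does, which is the property argued for above.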
Data Clustering Validation using Constraints