as the fraction of shared samples in matching clusters of $P^*$ and $P^o$. When
data partitions have the same number of clusters, $C_i(P^*, P^o)$ is equal to the
percentage of correct labelling.
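The index described above can be illustrated with a minimal sketch. It assumes that clusters are paired one-to-one by greedily taking the largest overlaps first; this matching rule, the function name `consistency_index`, and its arguments are illustrative assumptions rather than the procedure used to produce the reported results.

```python
from collections import Counter


def consistency_index(labels_a, labels_b):
    """Fraction of samples that fall in matched clusters of two partitions.

    labels_a and labels_b are equal-length sequences of cluster labels,
    one label per sample. The greedy matching below is an assumption made
    for illustration; an optimal assignment could be substituted.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same samples")
    n = len(labels_a)
    if n == 0:
        raise ValueError("partitions must be non-empty")

    # overlap[(a, b)] = number of samples in cluster a of A and cluster b of B
    overlap = Counter(zip(labels_a, labels_b))

    used_a, used_b = set(), set()
    shared = 0
    # Visit cluster pairs from largest to smallest overlap and keep a pair
    # only if neither of its clusters has been matched already.
    for (a, b), count in sorted(overlap.items(), key=lambda kv: -kv[1]):
        if a not in used_a and b not in used_b:
            used_a.add(a)
            used_b.add(b)
            shared += count
    return shared / n


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    found = [1, 1, 1, 0, 0, 1]
    print(consistency_index(truth, found))
```

Because the result depends only on which samples share a cluster, relabelling the clusters of either partition leaves it unchanged. Multiplying by 100 gives a percentage.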
                             ------------- EAC ---------------  ------ Graph ------  ----- Finite Mixture -----
Data Set    Ensemble  K_i       SL     CL     AL     WL   Cent   CSPA   HGPA   MCLA    Max   Mean    STD      L

Rings       Spectral  3       61.4   44.6   48.4   48.4   48.4   45.0   25.4   41.6   61.8   46.5    9.0   61.8
            Spectral  4       47.6   51.4   50.0   50.4   50.0   63.2   25.4   43.0   85.8   54.3   13.0   49.8
            Spectral  20      80.0   40.0   81.8   79.6   59.8   70.4   72.8   59.2   55.0   47.3    6.0   45.6
            Spectral  All     80.4   46.0   50.8   48.2   46.2   67.0   51.6   50.4   62.0   46.0    6.0   50.2
            K-means   All     85.6   40.0   44.6   59.4   51.0   47.8   67.2   61.2  60.60  50.30   8.00   59.4
            Dist              58.8   36.8   34.0   43.4   43.6

Half Rings  Spectral  3       64.6   85.8   86.2   87.6   86.2   83.2   45.4   84.6   85.2   68.7    9.0   83.2
            Spectral  8       69.8   41.2   97.2   92.8   58.2   93.4   92.6   95.0   72.8   56.7    9.0   62.6
            Spectral  All     95.0   87.6   95.0   95.0   95.0   93.2   89.2   93.0   74.6   58.6   12.0   51.0
            K-means   All     99.8   38.2   95.0   95.0   51.6   93.4   95.0   93.8  56.60  46.30   6.00   56.6
            Dist              95.0   72.0   73.4   73.6   73.6

Cigar       Spectral  4      100.0  100.0  100.0  100.0  100.0   71.2   34.4  100.0  100.0   88.8   11.0  100.0
            Spectral  5      100.0   61.6  100.0  100.0  100.0   70.8   41.2   73.6  100.0   81.8    9.0  100.0
            Spectral  8      100.0   61.6  100.0   70.4   79.6   70.0   70.4   66.8   72.8   59.4    8.0   58.8
            Spectral  All    100.0  100.0  100.0  100.0  100.0   70.4   74.8   63.2   54.0   44.8    7.0   44.4
            K-means   All    100.0   64.0   71.0   70.0   67.0   60.0   73.0   61.0  74.40  51.60  11.00   74.4
            Dist              60.4   55.6   87.2   58.0   51.6

Bars        Spectral  2       96.8   96.8   96.8   96.5   96.8   99.0   50.0   96.8   97.0   96.7    0.3   96.8
            Spectral  15      99.5   55.0   99.5   99.5   55.8   99.2   97.0   99.5   80.5   62.9   10.0   80.5
            Spectral  All     98.8   97.5   97.5   98.8   97.5   97.8   98.8   98.8   75.5   59.4    8.0   75.5
            K-means   All     54.3   55.5   98.8   98.8   57.5   99.0   98.0   99.0  79.30  63.00  10.00   74.2
            Dist              50.2   98.8   98.8   99.5   99.5

Log Yeast   Spectral  4       31.5   25.8   33.6   36.2   34.1   35.2   22.1   36.2   37.2   35.5    1.0   37.2
            Spectral  5       34.4   41.9   37.2   37.8   37.2   34.9   31.3   44.5   37.8   34.0    4.0   34.4
            Spectral  6       34.6   33.1   38.3   39.3   38.3   36.2   30.7   38.0   46.6   35.9    7.0   33.6
            Spectral  20      35.9   34.4   45.3   37.8   37.5   35.4   40.1   39.3   44.0   39.7    3.0   36.2
            Spectral  All     34.4   30.2   36.5   36.5   35.7   32.6   29.2   32.8   44.3   42.0   2.00   40.4
            K-means   All     37.0   27.0   41.0   35.0   41.0   34.0   32.0   32.0  39.80  36.40   3.00  36.20
            Dist              34.9   28.9   28.6   35.9   30.7

Std Yeast   Spectral  4       35.7   66.7   66.4   64.3   66.9   60.4   38.0   66.1   64.6   59.8    5.0   64.6
            Spectral  5       49.2   57.3   66.1   65.4   66.1   59.4   38.8   64.3   65.9   59.6    6.0   65.9
            Spectral  6       45.6   61.5   68.2   65.4   68.2   56.5   37.2   67.2   63.8   57.8    5.0   60.9
            Spectral  20      36.2   43.2   62.8   52.3   39.3   55.5   56.5   59.1   60.9   54.3    6.0   57.0
            Spectral  All     44.8   65.6   65.9   65.4   65.1   56.8   59.4   58.9   64.1   52.8   12.0   64.1
            K-means   All     48.0   54.0   67.0   56.0   45.0   53.0   57.0   54.0  51.60  45.20   8.00  39.30
            Dist              36.2   66.7   65.9   66.9   57.8

Optical     Spectral  9       60.5   79.0   77.3   84.3   77.3   79.5   35.2   79.5   77.1   65.3    9.0   77.1
            Spectral  10      70.0   77.3   77.6   84.5   77.6   80.5   33.4   84.5   75.6   69.7    6.0   67.9
            Spectral  15      72.2   53.0   72.2   88.3   73.8   86.3   52.7   78.1   75.9   67.4    8.0   75.9
            Spectral  All     60.4   75.0   79.1   87.3   79.1   88.1   45.4   77.1   72.6   67.8    4.0   72.6
            K-means   All     40.0   51.0   79.0   80.0   71.0   84.0   78.0   88.0  64.00  57.90   4.00  54.30
            Dist              10.6   54.1   75.7   74.8   10.6

Table 1. Combination results, $C_i(P^*, P^o)$, for spectral and K-means
clustering ensembles. Best results for each clustering ensemble are represented
in bold style.

Table 1 shows the $C_i(P^*, P^o)$ index for spectral and K-means clustering
ensembles on both synthetic and real data sets (see first column). In this
table, rows are grouped by clustering ensemble construction method (second
column). Rows for spectral clustering ensembles with a numerical label in the
$K_i$ column correspond to clustering ensembles that contain only partitions
with $K_i$ clusters (method (i) in section 2.1). For each K, a clustering
ensemble with N = 22 data partitions was produced by assigning σ values between
0.08 and 0.5, with increments of 0.02 (schematically described by the notation
[0.08:0.02:0.5]). We tested K taking all the values in the set
K = {2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20}; due to space limitations, only a small
number of the experimental results are presented. Rows for spectral clustering
ensembles with the label "All" (method (ii) in section 2.1) correspond to the
union of all the partitions produced by method (i) with K taking all the values