Clustering Stability and Ground Truth: Numerical Experiments

Maria José Amorim, Margarida Cardoso


Stability has been considered an important property for evaluating clustering solutions. Nevertheless, there are no conclusive studies on the relationship between this property and the ability to recover the clusters inherent to the data ("ground truth"). This study examines that relationship using synthetic data generated under diverse scenarios (controlling the relevant factors). Stability is evaluated with a weighted cross-validation procedure, and indices of agreement (corrected for agreement by chance) are used both to assess stability and for external validation. The results reveal a perspective not previously reported in the literature: although there is a clear relationship between stability and external validity when a broad range of scenarios is considered, the within-scenario conclusions deserve special attention. Faced with a specific clustering problem (as one is in practice), there is no significant relationship between stability and the ability to recover the data clusters.
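The chance-corrected agreement measure most commonly used for this purpose is the adjusted Rand index of Hubert and Arabie (1985), which rescales the raw pair-counting agreement so that independent partitions score around zero and identical partitions score one. A minimal pure-Python sketch (the function name and the toy partitions are illustrative, not taken from the paper):

```python
from math import comb
from collections import Counter

def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance agreement (Hubert & Arabie, 1985).

    labels_a, labels_b: cluster labels for the same n objects under
    two partitions; label names themselves are arbitrary.
    """
    n = len(labels_a)
    # Contingency counts: how many objects share cell (cluster in A, cluster in B).
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)   # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    return (sum_cells - expected) / (max_index - expected)

# Identical partitions (up to relabelling) score 1.0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

In a cross-validation stability assessment, the same index would be computed between the partitions obtained from replicate samples; for external validation it is computed between a clustering solution and the known ground-truth labels.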


  1. Ben-David, S. & von Luxburg, U., 2008. Relating clustering stability to properties of cluster boundaries. In: Servedio, R. & Zhang, T., eds. 21st Annual Conference on Learning Theory (COLT), Berlin. Springer, 379-390.
  2. Biernacki, C., Celeux, G., Govaert, G. & Langrognet, F., 2006. Model-Based Cluster and Discriminant Analysis with the MIXMOD Software. Computational Statistics and Data Analysis, 51, 587-600.
  3. Bubeck, S., Meila, M. & von Luxburg, U., 2012. How the initialization affects the stability of the k-means algorithm. ESAIM: Probability and Statistics, 16, 436-452.
  4. Cardoso, M. G., Faceli, K. & De Carvalho, A. C., 2010. Evaluation of Clustering Results: The Trade-off Bias-Variability. Classification as a Tool for Research. Springer, 201-208.
  5. Cardoso, M. G. M. S., 2007. Clustering and Cross-Validation. In: Ferreira, C., Lauro, C., Saporta, G. & Souto de Miranda, M., eds. IASC 07 - Statistics for Data Mining, Learning and Knowledge Extraction, Aveiro, Portugal.
  6. Celeux, G. & Diebolt, J., 1985. The SEM Algorithm: A probabilistic teacher algorithm derived from the EM algorithm for the mixture problem. Computational Statistics Quarterly, 2, 73-82.
  7. Chiang, M. M.-T. & Mirkin, B., 2010. Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads. Journal of Classification, 27, 3-40.
  8. Dempster, A. P., Laird, N. M. & Rubin, D. B., 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39, 1-38.
  9. Hartigan, J. A., 1975. Clustering Algorithms. New York: Wiley.
  10. Hennig, C., 2007. Cluster-wise assessment of cluster stability. Computational Statistics & Data Analysis, 52, 258-271.
  11. Horibe, Y., 1985. Entropy and correlation. IEEE Transactions on Systems, Man and Cybernetics, 5, 641-642.
  12. Hubert, L. & Arabie, P., 1985. Comparing partitions. Journal of Classification, 2, 193-218.
  13. Jain, A. K. & Dubes, R. C., 1988. Algorithms for clustering data, Englewood Cliffs, N.J.: Prentice Hall.
  14. Kraskov, A., Stögbauer, H., Andrzejak, R. G. & Grassberger, P., 2005. Hierarchical clustering using mutual information. EPL (Europhysics Letters), 70, 278.
  15. Lange, T., Roth, V., Braun, M. L. & Buhmann, J. M., 2004. Stability based validation of clustering solutions. Neural Computation, 16, 1299-1323.
  16. Lebret, R., Iovleff, S., Langrognet, F., Biernacki, C., Celeux, G. & Govaert, G., 2012. Rmixmod: The R package of the model-based unsupervised, supervised and semi-supervised classification Mixmod library [Online].
  17. von Luxburg, U., 2009. Clustering Stability: An Overview. Foundations and Trends in Machine Learning, 2, 235-274.
  18. Maitra, R. & Melnykov, V., 2010. Simulating data to study performance of finite mixture modeling and clustering algorithms. Journal of Computational and Graphical Statistics, 19, 354-376.
  19. McIntyre, R. M. & Blashfield, R. K., 1980. A nearest-centroid technique for evaluating the minimum-variance clustering procedure. Multivariate Behavioral Research, 2, 225-238.
  20. Milligan, G. W. & Cooper, M. C. 1985. An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50, 159-179.
  21. Rand, W. M., 1971. Objective Criteria for the Evaluation of Clustering Methods. Journal of the American Statistical Association, 66, 846-850.
  22. Steinley, D. & Henson, R., 2005. OCLUS: an analytic method for generating clusters with known overlap. Journal of Classification, 22, 221-250.
  23. Vendramin, L., Campello, R. J. & Hruschka, E. R., 2010. Relative clustering validity criteria: A comparative overview. Statistical Analysis and Data Mining, 3, 209-235.
  24. Vinh, N. X., Epps, J. & Bailey, J., 2010. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. The Journal of Machine Learning Research, 11, 2837-2854.
  25. Warrens, M. J., 2008. On similarity coefficients for 2 × 2 tables and correction for chance. Psychometrika, 73, 487-502.

Paper Citation

in Harvard Style

Amorim M. and Cardoso M. (2015). Clustering Stability and Ground Truth: Numerical Experiments. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015) ISBN 978-989-758-158-8, pages 259-264. DOI: 10.5220/0005597702590264

in Bibtex Style

@conference{kdir15,
author={Maria José Amorim and Margarida Cardoso},
title={Clustering Stability and Ground Truth: Numerical Experiments},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)},
year={2015},
pages={259-264},
doi={10.5220/0005597702590264},
isbn={978-989-758-158-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)
TI - Clustering Stability and Ground Truth: Numerical Experiments
SN - 978-989-758-158-8
AU - Amorim M.
AU - Cardoso M.
PY - 2015
SP - 259
EP - 264
DO - 10.5220/0005597702590264