Closing the Loop on a Complete Linkage Hierarchical Clustering Method

David Allen Olsen

2014

Abstract

To develop a complete linkage hierarchical clustering method that 1) substantially improves upon the accuracy of the standard complete linkage method and 2) can be fully automated or used with minimal operator supervision, the assumptions underlying the standard complete linkage method are unwound, the evaluation of pairs of data points for linkage is decoupled from the construction of cluster sets, and cluster sets are constructed de novo. These design choices make it possible to construct only the cluster sets that correspond to select, possibly non-contiguous levels of an (n(n-1)/2 + 1)-level hierarchical sequence, where n is the number of data points. To construct meaningful cluster sets without constructing an entire hierarchical sequence, a means based on distance graphs is used to find the meaningful levels of such a sequence. This paper presents an approach that mathematically captures the graphical relationships used to find meaningful levels and integrates the means into the new clustering method. The approach is inexpensive to implement. Consequently, the new clustering method is self-contained and incurs almost no extra cost in determining which cluster sets should be constructed and which should not. Empirical results from four experiments show that the approach does well at finding meaningful levels of hierarchical sequences.
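The level count in the abstract can be made concrete with a small sketch. The code below is an illustration only and is not the paper's algorithm: it assumes Euclidean distances, assumes that level k of the sequence means "the k closest pairs are linked" (level 0 having no links, which gives n(n-1)/2 + 1 levels), and uses hypothetical helper names (`sorted_pairs`, `num_levels`, `links_at_level`, `is_complete_cluster`). It shows how linkage evaluation (sorting the pairs once) can be kept separate from inspecting a single chosen level, and how the complete linkage criterion amounts to every pair inside a cluster being linked.

```python
from itertools import combinations
import math


def sorted_pairs(points):
    """Evaluate every pair once: return (distance, i, j) sorted by distance."""
    pairs = [
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    ]
    pairs.sort()
    return pairs


def num_levels(n):
    """Levels in the sequence: one per linked pair, plus the empty level 0."""
    return n * (n - 1) // 2 + 1


def links_at_level(pairs, level, n):
    """Adjacency sets after linking the `level` closest pairs (assumed meaning)."""
    adj = [set() for _ in range(n)]
    for _, i, j in pairs[:level]:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def is_complete_cluster(adj, members):
    """Complete linkage criterion: every pair of members must be linked."""
    return all(b in adj[a] for a, b in combinations(members, 2))


# Four points forming two tight pairs that are far apart.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
pairs = sorted_pairs(points)

# Inspect one selected level directly, without building the levels before it.
adj = links_at_level(pairs, 2, len(points))
```

With these four points, `num_levels(4)` is 7, and at level 2 the groups `[0, 1]` and `[2, 3]` satisfy the criterion while `[0, 1, 2]` does not. Choosing which levels are worth inspecting is the problem the paper addresses; this sketch simply takes the level as given.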



Paper Citation


in Harvard Style

Olsen, D. A. (2014). Closing the Loop on a Complete Linkage Hierarchical Clustering Method. In Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-989-758-039-0, pages 296-303. DOI: 10.5220/0005058902960303


in Bibtex Style

@conference{icinco14,
author={David Allen Olsen},
title={Closing the Loop on a Complete Linkage Hierarchical Clustering Method},
booktitle={Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2014},
pages={296-303},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005058902960303},
isbn={978-989-758-039-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - Closing the Loop on a Complete Linkage Hierarchical Clustering Method
SN - 978-989-758-039-0
AU - Olsen, David Allen
PY - 2014
SP - 296
EP - 303
DO - 10.5220/0005058902960303