Distributed K-Median Clustering with Application to Image Clustering
Aiyesha Ma, Ishwar K. Sethi
2007
Abstract
Developing algorithms suitable for distributed environments is increasingly important as data becomes more distributed. This paper proposes a distributed K-Median clustering algorithm for use in a distributed environment with a centralized server, such as the Napster model of peer-to-peer networking. Several approximate methods for computing the median in a distributed environment are proposed and analyzed in the context of the iterative K-Median algorithm. The proposed algorithm allows the clustering of multivariate data while ensuring that each cluster representative remains an item in the collection. This facilitates exploratory analysis in settings where retaining a representative from the collection is important, such as imaging applications.
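The abstract does not specify the approximate median methods, so the following is only a minimal sketch of one plausible scheme, not the authors' algorithm. It assumes each peer reports, per cluster, the local item with the smallest summed distance to its own cluster members (plus a member count), and the central server then picks among those reported items using count-weighted distances. All function names (`nearest`, `local_candidate`, `combine_candidates`, `distributed_k_median`) and the Euclidean metric are assumptions made for illustration.

```python
import math

def nearest(point, medians):
    """Index of the median closest to `point` (Euclidean distance assumed)."""
    return min(range(len(medians)), key=lambda j: math.dist(point, medians[j]))

def local_candidate(members):
    """Peer side: the local item with the smallest summed distance to the
    peer's other members of this cluster, together with the member count."""
    best = min(members, key=lambda p: sum(math.dist(p, q) for q in members))
    return best, len(members)

def combine_candidates(candidates):
    """Server side: choose the reported item with the smallest count-weighted
    distance to all reported items. This only approximates the true median."""
    def cost(c):
        return sum(n * math.dist(c, other) for other, n in candidates)
    return min((c for c, _ in candidates), key=cost)

def distributed_k_median(peers, medians, iterations=10):
    """Iterate assignment and approximate median update until stable.
    `peers` is a list of per-peer item lists; items are tuples of floats."""
    medians = list(medians)
    for _ in range(iterations):
        new_medians = []
        for j, old in enumerate(medians):
            candidates = []
            for items in peers:
                # Each peer assigns its own items; raw data never leaves the peer.
                members = [p for p in items if nearest(p, medians) == j]
                if members:
                    candidates.append(local_candidate(members))
            # Keep the old representative if no peer has members in this cluster.
            new_medians.append(combine_candidates(candidates) if candidates else old)
        if new_medians == medians:
            break
        medians = new_medians
    return medians

# Hypothetical toy data: two peers, two well-separated groups.
peers = [[(0, 0), (0, 1), (10, 10)], [(1, 0), (10, 11), (11, 10)]]
print(distributed_k_median(peers, [(0, 0), (10, 10)]))
```

Because the server only ever selects among items that peers report, every cluster representative in this sketch is an actual item held by some peer, which is the property the abstract emphasizes.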
References
- Lawrence, R.D., Almasi, G.S., Rushmeier, H.E.: A scalable parallel algorithm for self-organizing maps with applications to sparse data mining problems. Data Mining and Knowledge Discovery 3 (1999) 171-195
- Dhillon, I.S., Modha, D.S.: A data clustering algorithm on distributed memory multiprocessors. Large-Scale Parallel Data Mining, Lecture Notes in Artificial Intelligence 1759 (2000) 245-260
- Jin, R., Goswami, A., Agrawal, G.: Fast and exact out-of-core and distributed k-means clustering. Knowledge and Information System Journal (2005) Online first.
- Müller, W., Henrich, A.: Fast retrieval of high-dimensional feature vectors in P2P networks using compact peer data summaries. In: MIR '03: Proceedings of the 5th ACM SIGMM international workshop on Multimedia information retrieval, New York, NY, USA, ACM Press (2003) 79-86
- Müller, W., Eisenhardt, M., Henrich, A.: Scalable summary based retrieval in P2P networks. In: CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management, New York, NY, USA, ACM Press (2005) 586-593
- Blanquer, I., Hernández, V., Mas, F.: A P2P platform for sharing radiological images and diagnoses. In: Proc. Medical Image Computing and Computer Assisted Intervention (MICCAI). (2004)
- King, I., Ng, C.H., Sia, K.C.: Distributed content-based visual information retrieval system on peer-to-peer networks. ACM Trans. Inf. Syst. 22 (2004) 477-501
- Yang, Z.: Interactive content-based image retrieval in the peer-to-peer network using self-organizing maps. In: HUT T-110.551 Seminar on Internetworking. (2005)
- Fischer, G., Nurzenski, A.: Towards scatter/gather browsing in a hierarchical peer-to-peer network. In: P2PIR'05: Proceedings of the 2005 ACM workshop on Information retrieval in peer-to-peer networks, New York, NY, USA, ACM Press (2005) 25-32
- Manjunath, B.S., Ohm, J.R., Vasudevan, V.V., Yamada, A.: Color and texture descriptors. IEEE Transactions on Circuits and Systems for Video Technology 11 (2001)
- Sikora, T.: The MPEG-7 visual standard for content description - an overview. IEEE Transactions on Circuits and Systems for Video Technology 11 (2001)
Paper Citation
in Harvard Style
Ma, A. and Sethi, I. K. (2007). Distributed K-Median Clustering with Application to Image Clustering. In Proceedings of the 7th International Workshop on Pattern Recognition in Information Systems - Volume 1: PRIS, (ICEIS 2007) ISBN 978-972-8865-93-1, pages 215-220. DOI: 10.5220/0002425402150220
in Bibtex Style
@conference{pris07,
author={Aiyesha Ma and Ishwar K. Sethi},
title={Distributed K-Median Clustering with Application to Image Clustering},
booktitle={Proceedings of the 7th International Workshop on Pattern Recognition in Information Systems - Volume 1: PRIS, (ICEIS 2007)},
year={2007},
pages={215-220},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002425402150220},
isbn={978-972-8865-93-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Workshop on Pattern Recognition in Information Systems - Volume 1: PRIS, (ICEIS 2007)
TI - Distributed K-Median Clustering with Application to Image Clustering
SN - 978-972-8865-93-1
AU - Ma, A.
AU - Sethi, I. K.
PY - 2007
SP - 215
EP - 220
DO - 10.5220/0002425402150220