5 Conclusion
This paper presents a k-median clustering approach for use in a distributed environment,
such as a peer-to-peer system. While the presented approach uses the Napster model of
a centralized coordinator and index, the clustering method could be extended to
decentralized models once a suitable communication scheme is chosen.
This paper compared several methods for computing an approximate median using
only summary data for each peer, and analyzed these approaches within the context
of the k-median clustering algorithm. Variations in data distribution (such as random
versus expert) affected the performance of the proposed methods. Overall, two
approaches performed well regardless of the data distribution, although each carries
other trade-offs to consider.
The image clustering results showed that the clusters produced by non-distributed
clustering and those produced by distributed clustering are similar enough that
browsing and indexing methods based on the approximate approaches are feasible in
the distributed environment. Furthermore, the clustering algorithm worked well given
the limitations of the feature vector used.