CrossSense - Sensemaking in a Folksonomy with Cross-modal Clustering over Content and User Activities

Hans-Henning Gabriel, Myra Spiliopoulou, Alexandros Nanopoulos

2010

Abstract

Folksonomies are of increasing importance: many different platforms have emerged, and millions of people use them. We consider the case of a user who enters such a social platform and wants to get an overview of a particular domain. The folksonomy provides abundant information for that task in the form of documents, tags on them, and users who contribute documents and tags. We propose a process that identifies a small number of thematically “interesting objects” with respect to subject domains. Our novel algorithm, CrossSense, builds clusters composed of objects of different types upon a data tensor. It then selects pivot objects that are characteristic of one cluster and are associated with many objects of different types from the clusters. Next, CrossSense collects all the folksonomy content that is associated with a pivot object, i.e. the object’s world. We rank pivot objects and present the top ones to the user. We have experimented with Bibsonomy data against a baseline that selects the most popular users, documents and tags, accompanied by the objects most frequently co-occurring with them. Our experiments show that our pivot objects exhibit more homogeneity and constitute a smaller set of entities to be inspected by the user.
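To make the comparison baseline concrete, the following is a minimal sketch of a popularity-based selection as the abstract describes it: take the most frequent users, tags and documents, and attach to each the objects that co-occur with it most often. It is not CrossSense itself, and the abstract does not give the exact baseline definition. The toy `posts` data, the function name `popularity_baseline`, and the parameters `top_k` and `top_co` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy folksonomy: each post is a (user, tag, document) triple.
posts = [
    ("alice", "clustering", "doc1"),
    ("alice", "tensor", "doc1"),
    ("bob", "clustering", "doc1"),
    ("bob", "clustering", "doc2"),
    ("bob", "tagging", "doc2"),
    ("carol", "tagging", "doc3"),
]

# Position of each object type inside a post triple.
MODES = ("user", "tag", "document")


def popularity_baseline(posts, top_k=1, top_co=2):
    """Assumed baseline: for each mode, pick the top_k most frequent objects
    and, for each of them, the top_co objects of the other modes that appear
    in the same posts most often."""
    result = {}
    for i, mode in enumerate(MODES):
        # Popularity = number of posts the object appears in.
        counts = Counter(post[i] for post in posts)
        for obj, _ in counts.most_common(top_k):
            co_occurring = Counter()
            for post in posts:
                if post[i] != obj:
                    continue
                # Count every object of a different mode in the same post.
                for j, other in enumerate(post):
                    if j != i:
                        co_occurring[(MODES[j], other)] += 1
            result[(mode, obj)] = [o for o, _ in co_occurring.most_common(top_co)]
    return result


baseline = popularity_baseline(posts)
for popular, companions in baseline.items():
    print(popular, "->", companions)
```

Each popular object together with its companions plays roughly the role that a pivot object's "world" plays in the proposed approach, which is what makes the two result sets comparable in size and homogeneity.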



Paper Citation


in Harvard Style

Gabriel H., Spiliopoulou M. and Nanopoulos A. (2010). CrossSense - Sensemaking in a Folksonomy with Cross-modal Clustering over Content and User Activities. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 100-111. DOI: 10.5220/0003101101000111


in Bibtex Style

@conference{kdir10,
author={Hans-Henning Gabriel and Myra Spiliopoulou and Alexandros Nanopoulos},
title={CrossSense - Sensemaking in a Folksonomy with Cross-modal Clustering over Content and User Activities},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={100-111},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003101101000111},
isbn={978-989-8425-28-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - CrossSense - Sensemaking in a Folksonomy with Cross-modal Clustering over Content and User Activities
SN - 978-989-8425-28-7
AU - Gabriel H.
AU - Spiliopoulou M.
AU - Nanopoulos A.
PY - 2010
SP - 100
EP - 111
DO - 10.5220/0003101101000111