COLLABORATIVE OLAP WITH TAG CLOUDS - Web 2.0 OLAP Formalism and Experimental Evaluation

Kamel Aouiche, Daniel Lemire, Robert Godin

2008

Abstract

Increasingly, business projects are ephemeral. New Business Intelligence tools must support ad-lib data sources and quick perusal. Meanwhile, tag clouds are a popular community-driven visualization technique. Hence, we investigate tag-cloud views with support for OLAP operations such as roll-ups, slices, dices, clustering, and drill-downs. As a case study, we implemented an application where users can upload data and immediately navigate through its ad hoc dimensions. To support social networking, views can be easily shared and embedded in other Web sites. Algorithmically, our tag-cloud views are approximate range top-k queries over spontaneous data cubes. We present experimental evidence that iceberg cuboids provide adequate online approximations. We benchmark several browser-oblivious tag-cloud layout optimizations.
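The idea of answering a tag-cloud (top-k) query from an iceberg cuboid can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes COUNT as the measure and a plain minimum-support threshold, and the function names (`iceberg_cuboid`, `top_k_tags`) and the toy fact table are hypothetical.

```python
from collections import defaultdict
import heapq


def iceberg_cuboid(facts, dims, min_support):
    """Aggregate COUNT over `dims`; keep only cells with count >= min_support."""
    counts = defaultdict(int)
    for row in facts:
        counts[tuple(row[d] for d in dims)] += 1
    return {cell: c for cell, c in counts.items() if c >= min_support}


def top_k_tags(cuboid, dims, tag_dim, k, slice_on=None):
    """Roll the cuboid up to `tag_dim` (after an optional slice) and
    return the k heaviest (tag, weight) pairs."""
    slice_on = slice_on or {}
    pos = {d: i for i, d in enumerate(dims)}
    weights = defaultdict(int)
    for cell, c in cuboid.items():
        # A slice fixes the value of one or more dimensions.
        if all(cell[pos[d]] == v for d, v in slice_on.items()):
            weights[cell[pos[tag_dim]]] += c
    # Ties are broken on the tag name so the result is deterministic.
    return heapq.nlargest(k, weights.items(), key=lambda kv: (kv[1], kv[0]))


# Hypothetical fact table: one row per sale.
facts = (
    [{"city": "Montreal", "product": "tea"}] * 3
    + [{"city": "Montreal", "product": "coffee"}] * 2
    + [{"city": "Montreal", "product": "juice"}]
    + [{"city": "Quebec", "product": "tea"}]
    + [{"city": "Quebec", "product": "coffee"}]
)
dims = ("city", "product")

exact = iceberg_cuboid(facts, dims, min_support=1)   # full cuboid
approx = iceberg_cuboid(facts, dims, min_support=2)  # iceberg cuboid

exact_cloud = top_k_tags(exact, dims, "product", k=2)
approx_cloud = top_k_tags(approx, dims, "product", k=2)
montreal_cloud = top_k_tags(approx, dims, "product", k=2,
                            slice_on={"city": "Montreal"})
```

In this toy case the iceberg cuboid stores fewer cells and underestimates the tag weights, yet it ranks the top tags in the same order as the full cuboid, which is the kind of approximation the abstract refers to.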



Paper Citation


in Harvard Style

Aouiche K., Lemire D. and Godin R. (2008). COLLABORATIVE OLAP WITH TAG CLOUDS - Web 2.0 OLAP Formalism and Experimental Evaluation. In Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-26-5, pages 5-12. DOI: 10.5220/0001515800050012


in Bibtex Style

@conference{webist08,
author={Kamel Aouiche and Daniel Lemire and Robert Godin},
title={COLLABORATIVE OLAP WITH TAG CLOUDS - Web 2.0 OLAP Formalism and Experimental Evaluation},
booktitle={Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2008},
pages={5-12},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001515800050012},
isbn={978-989-8111-26-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - COLLABORATIVE OLAP WITH TAG CLOUDS - Web 2.0 OLAP Formalism and Experimental Evaluation
SN - 978-989-8111-26-5
AU - Aouiche K.
AU - Lemire D.
AU - Godin R.
PY - 2008
SP - 5
EP - 12
DO - 10.5220/0001515800050012