Visualization and Clustering of Online Book Reviews

Shiaofen Fang, Lanfang Miao, Eric Lin

Abstract

Online user reviews of products, movies, books, etc. have been an important source of information for applications such as social networking, online retail, and sentiment analysis. In this paper, we present a novel tool for analyzing and visualizing online book reviews. Using text mining techniques, nontrivial features (tags) are identified in the text data extracted from the online reviews. These keyword tags are used to cluster both the books and the readers based on global tag similarities. Two different visualization methods are proposed: parallel coordinate views and 3D correlative cluster views. The parallel coordinate visualization provides a flat view of the tag distributions to reveal clustering patterns. A novel 3D correlative visualization technique is developed to visually represent the correlations of reader clusters and book clusters. These visualization techniques can also be applied to other types of online text data in social networks and web commerce.
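
As a rough illustration of tag-based clustering, the sketch below counts tag occurrences per book and groups books whose tag vectors are similar. It is not the authors' method: the tag vocabulary `TAGS`, the sample reviews, the cosine similarity measure, the greedy grouping rule, and the 0.8 threshold are all assumptions made for this example, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical tag vocabulary. In the paper, tags are found by text mining;
# here they are fixed by hand for illustration.
TAGS = ["plot", "characters", "romance", "boring", "funny"]


def tag_vector(review_texts):
    """Count how often each tag occurs across a list of review strings."""
    counts = Counter()
    for text in review_texts:
        for word in text.lower().split():
            word = word.strip(".,!?;:\"'")  # drop surrounding punctuation
            if word in TAGS:
                counts[word] += 1
    return [counts[tag] for tag in TAGS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.8):
    """Greedy single-pass grouping (an assumed, simplistic rule).

    Each item joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []
    for key, vec in vectors.items():
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(key)
                break
        else:
            clusters.append([key])
    return clusters


# Made-up sample data, not taken from the paper.
reviews = {
    "Book A": ["Funny plot, funny characters.", "A funny read."],
    "Book B": ["So funny! Great characters."],
    "Book C": ["Boring romance.", "The romance was boring."],
}
vectors = {book: tag_vector(texts) for book, texts in reviews.items()}
clusters = cluster(vectors)
print(clusters)
```

The same functions could be applied to readers instead of books by keying `reviews` on reader names, which would give the two sets of clusters that a correlative view relates to each other.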



Paper Citation


in Harvard Style

Fang S., Miao L. and Lin E. (2014). Visualization and Clustering of Online Book Reviews. In Proceedings of the 5th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2014) ISBN 978-989-758-005-5, pages 187-194. DOI: 10.5220/0004745501870194


in Bibtex Style

@conference{ivapp14,
author={Shiaofen Fang and Lanfang Miao and Eric Lin},
title={Visualization and Clustering of Online Book Reviews},
booktitle={Proceedings of the 5th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2014)},
year={2014},
pages={187-194},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004745501870194},
isbn={978-989-758-005-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2014)
TI - Visualization and Clustering of Online Book Reviews
SN - 978-989-758-005-5
AU - Fang S.
AU - Miao L.
AU - Lin E.
PY - 2014
SP - 187
EP - 194
DO - 10.5220/0004745501870194