Exploration and Visualization of Big Graphs - The DBpedia Case Study

Enrico G. Caldarola, Antonio Picariello, Antonio M. Rinaldi, Marco Sacco

Abstract

Data and information visualization is becoming increasingly strategic for the exploration and explanation of large data sets. The Big Data paradigm calls for new approaches and new technological solutions to deal with the volume and variety of today's data. Not surprisingly, a plethora of new tools has emerged, each with its pros and cons, but all espousing the cause of the "Bigness of Data". In this paper, we take one of these emerging tools, namely Neo4j, and stress its capabilities in importing, querying and visualizing data from a big case study: DBpedia. We describe each step of the study, focusing on the strategies used to overcome the problems that arise mainly from the intricate nature of the case study and its volume. We deal with both the intensional schema of DBpedia and its extensional part in order to obtain the best result in its visualization. Finally, we attempt to define some criteria to simplify the large-scale visualization of DBpedia, providing examples and the considerations that have arisen. The ultimate goal of this work is to investigate techniques and approaches for gaining more insight from the visual representation and analytics of large graph databases.

Paper Citation


in Harvard Style

Caldarola E. G., Picariello A., Rinaldi A. M. and Sacco M. (2016). Exploration and Visualization of Big Graphs - The DBpedia Case Study. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 257-264. DOI: 10.5220/0006046802570264


in Bibtex Style

@conference{kdir16,
author={Enrico G. Caldarola and Antonio Picariello and Antonio M. Rinaldi and Marco Sacco},
title={Exploration and Visualization of Big Graphs - The DBpedia Case Study},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={257-264},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006046802570264},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Exploration and Visualization of Big Graphs - The DBpedia Case Study
SN - 978-989-758-203-5
AU - Caldarola E. G.
AU - Picariello A.
AU - Rinaldi A. M.
AU - Sacco M.
PY - 2016
SP - 257
EP - 264
DO - 10.5220/0006046802570264