Towards Analytical MD Stars from Linked Data

Victoria Nebot, Rafael Berlanga

Abstract

Although the Linked Data (LD) initiative has given rise to large amounts of open, semi-structured, and rich data published on the Web, effective analytical tools that go beyond browsing and querying are still lacking. To address this issue, we propose the automatic generation of multidimensional (MD) analytical stars. The MD model owes much of its success in data analysis to its simplicity. In this paper, we therefore aim to automatically discover MD conceptual patterns that summarize LD. These patterns resemble the MD star schema typical of relational data warehousing. Our method is based on probabilistic graphical models and uses statistics about the instance data to generate the MD stars. We present a first implementation, and the preliminary results on large LD sets encourage further work in this direction.



Paper Citation


in Harvard Style

Nebot V. and Berlanga R. (2014). Towards Analytical MD Stars from Linked Data. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014) ISBN 978-989-758-048-2, pages 117-125. DOI: 10.5220/0005128701170125


in Bibtex Style

@conference{kdir14,
author={Victoria Nebot and Rafael Berlanga},
title={Towards Analytical MD Stars from Linked Data},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014)},
year={2014},
pages={117-125},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005128701170125},
isbn={978-989-758-048-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014)
TI - Towards Analytical MD Stars from Linked Data
SN - 978-989-758-048-2
AU - Nebot V.
AU - Berlanga R.
PY - 2014
SP - 117
EP - 125
DO - 10.5220/0005128701170125