Mixed Driven Refinement Design of Multidimensional Models based on Agglomerative Hierarchical Clustering

Lucile Sautot, Sandro Bimonte, Ludovic Journaux, Bruno Faivre

Abstract

Data warehouses (DW) and OLAP systems are business intelligence technologies that allow the on-line analysis of huge volumes of data according to users’ needs. The success of DW projects essentially depends on the design phase, where functional requirements meet data sources (mixed design methodology) (Phipps and Davis, 2002). However, for complex applications, existing design methodologies seem inefficient, since decision-makers define functional requirements that cannot be deduced from data sources (data-driven approach) and/or they lack sufficient application domain knowledge (user-driven approach) (Sautot et al., 2014b). Therefore, in this paper we propose a new mixed refinement design methodology in which the classical data-driven approach is enhanced with data mining to create new dimension hierarchies. A tool implementing our approach is also presented to validate our theoretical proposal.
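
As a rough illustration of how agglomerative hierarchical clustering can yield candidate dimension hierarchies, the sketch below merges dimension members bottom-up and keeps the partition at a few chosen sizes as hierarchy levels. It is not the authors' tool: the member names, the two numeric descriptors per member, the Euclidean distance, the average linkage and the helper names `build_levels` and `roll_up` are all assumptions made for the example.

```python
from itertools import combinations
from math import dist

# Hypothetical dimension members, each described by two numeric attributes.
members = {
    "A": (0.0, 0.0), "B": (0.0, 1.0),
    "C": (5.0, 5.0), "D": (5.0, 6.0),
    "E": (20.0, 20.0), "F": (21.0, 20.0),
}


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of member names."""
    return sum(dist(points[x], points[y]) for x in a for y in b) / (len(a) * len(b))


def build_levels(points, sizes):
    """Merge the two closest clusters repeatedly; keep the partition at each requested size."""
    clusters = [frozenset([name]) for name in sorted(points)]
    levels = {}
    while True:
        if len(clusters) in sizes:
            levels[len(clusters)] = list(clusters)
        if len(clusters) == 1:
            break
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], points))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return levels


def roll_up(fine, coarse):
    """Map each fine-level cluster to the single coarse-level cluster that contains it."""
    return {f: next(c for c in coarse if f <= c) for f in fine}


levels = build_levels(members, sizes={3, 2})
parent_of = roll_up(levels[3], levels[2])
for child, parent in parent_of.items():
    print(sorted(child), "->", sorted(parent))
```

Because every merge only joins existing clusters, the partitions are nested, so each cluster of the finer level rolls up to exactly one cluster of the coarser level, which is the property a dimension hierarchy needs for aggregation.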

References

  1. Bentayeb, F. (2008). K-means based approach for OLAP dimension updates. In 10th International Conference on Enterprise Information Systems (ICEIS), pages 531-534.
  2. Briand, L. C., Morasca, S., and Basili, V. R. (2002). An operational process for goal-driven definition of measures. IEEE Trans. Software Eng., 28(12):1106-1125.
  3. Carme, A., Mazon, J.-N., and Rizzi, S. (2010). A model-driven heuristic approach for detecting multidimensional facts in relational data sources. In Pedersen, T., Mohania, M., and Tjoa, A. M., editors, Proceedings of 12th International Conference on Data Warehousing and Knowledge Discovery (DaWaK), volume LNCS 6263, pages 13-24.
  4. Ceci, M., Cuzzocrea, A., and Malerba, D. (2011). OLAP over continuous domains via density-based hierarchical clustering. In 15th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES 2011), volume 2, pages 559-570.
  5. Favre, C., Bentayeb, F., and Boussaid, O. (2006). A knowledge-driven data warehouse model for analysis evolution. Frontiers in Artificial Intelligence and Applications, 143:271.
  6. Jensen, M. R., Holmgren, T., and Pedersen, T. B. (2004). Discovering multidimensional structure in relational data. In Data Warehousing and Knowledge Discovery: 6th International Conference (DaWaK).
  7. Jovanovic, P., Romero, O., Simitsis, A., and Abelló, A. (2012). ORE: An iterative approach to the design and evolution of multi-dimensional schemas. In Proceedings of the Fifteenth International Workshop on Data Warehousing and OLAP, DOLAP '12, pages 1-8, New York, NY, USA. ACM.
  8. Kimball, R. (1996). The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. Wiley.
  9. Lenz, H.-J. and Thalheim, B. (2009). A formal framework of aggregation for the OLAP-OLTP model. Journal of Universal Computer Science, 15(1):273-303.
  10. Leonhardi, B., Mitschang, B., Pulido, R., Sieb, C., and Wurst, M. (2010). Augmenting OLAP exploration with dynamic advanced analytics. In 13th International Conference on Extending Database Technology (EDBT 2010).
  11. Mahboubi, H., Ralaivao, J.-C., Loudcher, S., Boussaïd, O., Bentayeb, F., Darmont, J., et al. (2009). X-WACoDa: an XML-based approach for warehousing and analyzing complex data. Data Warehousing Design and Advanced Engineering Applications: Methods for Complex Construction, pages 38-54.
  12. Messaoud, R. B., Boussaid, O., and Rabaséda, S. (2004). A new OLAP aggregation based on the AHC technique. In DOLAP 2004, ACM Seventh International Workshop on Data Warehousing and OLAP, pages 65-72.
  13. Miquel, M., Bédard, Y., and Brisebois, A. (2002a). Conception d'entrepôts de données géospatiales à partir de sources hétérogènes. Exemple d'application en foresterie. Ingénieries des Systèmes d'information, 7(3):89-111.
  14. Miquel, M., Bédard, Y., Brisebois, A., Pouliot, J., Marchand, P., and Brodeur, J. (2002b). Modeling multidimensional spatio-temporal data warehouses in a context of evolving specifications. International Archives Of Photogrammetry Remote Sensing And Spatial Information Sciences, 34(4):142-147.
  15. Nguyen, T. B. and Tjoa, A. M. (2000). An object oriented multidimensional data model for OLAP. In Proc. of 1st Int. Conf. on Web-Age Information Management (WAIM), number 1846 in LNCS, pages 69-82. Springer.
  16. Phipps, C. and Davis, K. C. (2002). Automating data warehouse conceptual schema design and evaluation. In Proceedings of the 4th International Workshop on Design and Management of Data Warehouses (DMDW), volume 2.
  17. Romero, O. and Abelló, A. (2009). A survey of multidimensional modeling methodologies. International Journal of Data Warehousing and Mining, 5(2):1-23.
  18. Romero, O. and Abelló, A. (2010). Automatic validation of requirements to support multidimensional design. Data and Knowledge Engineering, 69:917-942.
  19. Sautot, L., Bimonte, S., Journaux, L., and Faivre, B. (2014a). A methodology and tool for rapid prototyping of data warehouses using data mining: Application to birds biodiversity. In Proceedings of 4th International Conference on Model & Data Engineering (MEDI). In Press.
  20. Sautot, L., Faivre, B., Journaux, L., and Molin, P. (2014b). The hierarchical agglomerative clustering with Gower index: a methodology for automatic design of OLAP cube in ecological data processing context. Ecological Informatics. In Press.

Paper Citation


in Harvard Style

Sautot L., Bimonte S., Journaux L. and Faivre B. (2015). Mixed Driven Refinement Design of Multidimensional Models based on Agglomerative Hierarchical Clustering. In Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-096-3, pages 547-555. DOI: 10.5220/0005404605470555


in Bibtex Style

@conference{iceis15,
author={Lucile Sautot and Sandro Bimonte and Ludovic Journaux and Bruno Faivre},
title={Mixed Driven Refinement Design of Multidimensional Models based on Agglomerative Hierarchical Clustering},
booktitle={Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2015},
pages={547-555},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005404605470555},
isbn={978-989-758-096-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Mixed Driven Refinement Design of Multidimensional Models based on Agglomerative Hierarchical Clustering
SN - 978-989-758-096-3
AU - Sautot L.
AU - Bimonte S.
AU - Journaux L.
AU - Faivre B.
PY - 2015
SP - 547
EP - 555
DO - 10.5220/0005404605470555