Ines Ben Messaoud, Jamel Feki, Kais Khrouf, Gilles Zurfluh


Data warehouses and OLAP (On-Line Analytical Processing) technologies analyse the large volumes of structured data that companies store in conventional databases. Recent work underlines the importance of textual data for decision making and has therefore led to the construction of document warehouses, since documents help decision makers better understand the evolution of their business activities. In general, these documents are in XML format, are geographically distributed, and are described by multiple, differing structures. This paper presents a method for building a distributed document warehouse. The method consists of two steps: i) unification of XML document structures, to provide a global and generic view of the distributed document warehouse, and ii) multidimensional modeling of the unified documents for decision-support purposes. More specifically, this paper focuses on the unification step.
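To make the idea of structure unification concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that a document's structure can be summarized as its set of element paths, and that a unified structure is simply the union of those sets. The function names `structure_paths` and `unify_structures` and the two sample documents are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET


def structure_paths(xml_text):
    """Return the set of element paths (e.g. 'article/title') in one document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        # Build the slash-separated path from the root down to this element.
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def unify_structures(documents):
    """Union the path sets of all documents into one sorted global structure.

    Assumption: a plain union of paths; no renaming or similarity matching.
    """
    unified = set()
    for doc in documents:
        unified |= structure_paths(doc)
    return sorted(unified)


# Two hypothetical documents that share a root but differ in structure.
doc_a = "<article><title>T1</title><author>A</author></article>"
doc_b = "<article><title>T2</title><section><para>p</para></section></article>"

unified = unify_structures([doc_a, doc_b])
for path in unified:
    print(path)
```

A plain union ignores element order, cardinality, and synonymous tag names, all of which a real unification method would have to handle. It is used here only to show how several differing structures can be folded into one global view.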



Paper Citation

in Harvard Style

Ben Messaoud I., Feki J., Khrouf K. and Zurfluh G. (2011). UNIFICATION OF XML DOCUMENT STRUCTURES FOR DOCUMENT WAREHOUSE (DocW). In Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8425-53-9, pages 85-94. DOI: 10.5220/0003502100850094
