The decision-maker assigns the concept "Data
Warehouse" to the generic element Content (contenu in
Figure 5); this is the first dimension. He or she then
selects the generic element Year (année in Figure 5) as
the second dimension and the generic element Author
(auteur in Figure 5) as the third dimension. The
measure is the count of titles (titre in Figure 5).
When a concept is to be assigned to a generic
element, the system displays the list of all ontologies
stored in the warehouse so that the decision-maker can
choose the appropriate one (cf. Figure 6).
Figure 6: Choice of ontology for Example 2.
The result of this analysis is displayed as a
multidimensional table, as shown in Figure 7: the
first column lists the Years, and the second column
gives the Number of articles for the Author named
in the sheet heading.
Figure 7: Multidimensional table for Example 2.
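The aggregation behind such a table can be sketched as follows. This is only a minimal illustration, not the implementation of the system described here: the sample documents, the names DOCUMENTS, count_titles and print_sheet, and the representation of a document as a (title, year, author, concepts) tuple are all assumptions introduced for the example. The sketch keeps the documents whose content is annotated with the chosen concept, then counts titles per year for each author.

```python
from collections import defaultdict

# Hypothetical sample data (not taken from the paper):
# (title, year, author, concepts annotating the content).
DOCUMENTS = [
    ("Title A", 2009, "Author 1", {"Data Warehouse"}),
    ("Title B", 2009, "Author 1", {"Data Warehouse", "OLAP"}),
    ("Title C", 2010, "Author 1", {"Data Warehouse"}),
    ("Title D", 2010, "Author 2", {"Data Warehouse"}),
    ("Title E", 2010, "Author 2", {"XML"}),
]


def count_titles(documents, concept):
    """Return {author: {year: number of titles}} for the documents
    whose content is annotated with `concept`."""
    table = defaultdict(lambda: defaultdict(int))
    for _title, year, author, concepts in documents:
        if concept in concepts:  # restriction on the Content dimension
            table[author][year] += 1  # measure: count of titles
    # Sort the years so that each sheet reads chronologically.
    return {a: dict(sorted(years.items())) for a, years in table.items()}


def print_sheet(table, author):
    """Print one sheet: the author as heading, then one row per year."""
    print(f"Author: {author}")
    print("Year | Number of articles")
    for year, count in table.get(author, {}).items():
        print(f"{year} | {count}")


TABLE = count_titles(DOCUMENTS, "Data Warehouse")
print_sheet(TABLE, "Author 1")
```

Each author corresponds to one sheet, which mirrors the layout described above: the Author appears in the sheet heading, and the rows pair a Year with the Number of articles.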
6 CONCLUSIONS
The document warehouse is a solution for
exploiting and analyzing textual data extracted from
documents.
The meta-model proposed in this paper is suitable
for (1) storing heterogeneous documents according
to their structures and semantics, and (2) applying
multidimensional analysis techniques, i.e.
analyzing data according to several dimensions
through a graphical language interface that is
simple for users.
In future work, we intend to extend this work in
the following directions: (1) determining the semantic
structures of the documents integrated in the
warehouse and gathering these structures into
semantic classes, (2) defining user profiles from parts
of domain ontologies (instead of simple keywords),
and (3) involving the user in the construction process
of document marts (recommendation of potential
analysis components based on the user profiles and
the relevant domain ontologies).
REFERENCES
Ben Messaoud, I., Feki, J., Khrouf, K., Zurfluh, G., 2011.
Unification of XML Document Structures for DOCW.
In International Conference on Enterprise Information
Systems (ICEIS’11), p. 85-94, Beijing, China.
Boussaid, O., Ben Messaoud, R., Choquet, R., Anthoard,
S., 2006. X-Warehousing: An XML-Based Approach
for Warehousing Complex Data. In East European
Conference on Advances in Databases and
Information Systems (ADBIS’06), p. 39–54.
Thessaloniki, Hellas.
Hachaichi, Y., Feki, J., Ben-Abdallah, H., 2009.
Designing Data Marts from XML and Relational Data
Sources. In Design and Advanced Engineering
Applications: Methods for Complex Construction,
Advances in Data Warehousing and Mining Series, IGI
Global, p. 55-80, Bellatreche Edition.
Khrouf, K., Soulé-Dupuy, C., 2004. A Textual Warehouse
Approach: a Web Data Repository. In Intelligent
Agents for Data Mining and Information Retrieval
(Chapter VII), p. 101-124, Idea Group Publishing.
Ravat, F., Teste, O., Tournier, R., Zurfluh, G., 2008.
Top_Keyword: an Aggregation Function for Textual
Document OLAP. In International Conference on
Data Warehousing and Knowledge Discovery
(DaWaK’08), p. 55-64, Turin, Italy.
Ravat, F., Teste, O., Tournier, R., Zurfluh, G., 2010.
Finding an application-appropriate model for XML
data warehouses. In Information Systems, Vol. 35,
issue 6, p. 662-687, Elsevier.
Tseng, F. S. C., Chou, A. Y. H., 2006. The concept of
document warehousing for multi-dimensional
modeling of textual-based business intelligence. In
Decision Support Systems, Vol. 42, p. 727–744,
Elsevier.
WEBIST 2012 - 8th International Conference on Web Information Systems and Technologies