helping to solve semantic heterogeneities (outside the
scope of this paper), and defining the Extracting,
Transforming and Loading (ETL) processes.
The integration processor consists of two modules
that have been added to the reference architecture
in order to carry out the integration of the temporal
and spatial properties of the data, taking into account
the extraction method used by each data source:
The Temporal and Spatial Integration Processor
uses the set of semantic relations and the conformed
schemas obtained during the similarity detection
phase (Oliva and Saltor, 1996), which is part of the
data-schema integration methodology. As a result,
we obtain a set of rules describing the integration
possibilities that exist between the data originating
from the data sources (minimum resultant
granularity…).
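As a simple illustration of one such rule, the minimum resultant granularity can be derived by taking the coarser of two source granularities. The function and granularity table below are a hypothetical sketch, not the paper's actual rule representation:

```python
# Hypothetical sketch: one integration rule produced by the Temporal and
# Spatial Integration Processor. When two sources report data at different
# temporal granularities, the integrated data cannot be finer than the
# coarser of the two. Names and values here are illustrative assumptions.
GRANULARITY_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

def minimum_resultant_granularity(gran_a: str, gran_b: str) -> str:
    """Return the coarser of the two granularities, which bounds the result."""
    if GRANULARITY_SECONDS[gran_a] >= GRANULARITY_SECONDS[gran_b]:
        return gran_a
    return gran_b

print(minimum_resultant_granularity("minute", "hour"))  # hour
```

For example, integrating a source sampled every minute with one sampled hourly can only yield hourly data.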
The Metadata Refreshment Generator
determines the most suitable parameters for
refreshing the data in the DW schema. As a result,
the DW schema is obtained along with the
Refreshment Metadata needed to update the DW
according to the data extraction method and other
spatio-temporal properties of the data sources.
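The structure of such refreshment metadata could be sketched as a simple record per data source; the field names and example values below are assumptions for illustration only, not the paper's actual metadata schema:

```python
# Hypothetical sketch of the Refreshment Metadata produced for each data
# source. The actual metadata model belongs to the described architecture;
# every field name and example value here is an illustrative assumption.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RefreshmentMetadata:
    source_id: str                         # identifier of the data source
    extraction_method: str                 # e.g. "log", "trigger", "snapshot-diff"
    refresh_period_s: int                  # chosen refresh interval, in seconds
    spatial_resolution_m: Optional[float] = None  # spatial granularity, if any

meta = RefreshmentMetadata("web_source_1", "snapshot-diff", 3600)
print(meta.refresh_period_s)  # 3600
```

A record like this would let the refreshment process schedule each source's update according to its extraction method and spatio-temporal properties.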
Data Warehouse Refreshment. After schema
integration, once the DW schema is obtained, it must
be maintained and updated. Each Data Integration
Processor is responsible for the incremental capture
of data from its corresponding data source and for
transforming the data to resolve semantic
heterogeneities. Each Data Integration Processor
accesses its corresponding data source according
to the temporal and spatial requirements obtained in
the integration stage. A global Data Integration
Processor then uses a parallel, fuzzy data integration
algorithm to integrate the data (Araque et al.,
2007b).
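The refreshment flow above can be sketched as parallel incremental capture followed by a global merge. The fuzzy algorithm itself is given in Araque et al. (2007b); the code below only substitutes a generic stand-in (keep the value with the highest membership degree per key), and all names are illustrative assumptions:

```python
# Hypothetical sketch of the refreshment flow: each Data Integration
# Processor captures its source's changes in parallel, and a global
# integrator merges them. The merge rule here (highest membership degree
# wins) is a generic stand-in for the fuzzy algorithm of Araque et al.
# (2007b), not the actual published algorithm.
from concurrent.futures import ThreadPoolExecutor

def capture(source):
    # Incremental capture: return only the rows changed since the last
    # refresh (stubbed here as a precomputed "changes" list).
    return source["changes"]

def integrate(captured):
    # For each key, keep the (value, membership) pair with the highest
    # membership degree across all sources.
    merged = {}
    for rows in captured:
        for key, value, membership in rows:
            if key not in merged or membership > merged[key][1]:
                merged[key] = (value, membership)
    return merged

sources = [
    {"changes": [("s1", 10, 0.6)]},
    {"changes": [("s1", 12, 0.9), ("s2", 7, 0.8)]},
]
with ThreadPoolExecutor() as pool:
    captured = list(pool.map(capture, sources))
print(integrate(captured))  # {'s1': (12, 0.9), 's2': (7, 0.8)}
```

The thread pool stands in for the parallelism of the per-source processors; in the described architecture each processor would additionally apply its own transformations before the global merge.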
5 CONCLUSIONS
In this paper we have presented a DW architecture
for integration based on the temporal and spatial
properties of the data as well as the temporal and
spatial characteristics of the data sources.
We have described the modules added to the
Sheth and Larson reference architecture. These
modules are responsible for checking the temporal
and spatial parameters of the data sources and for
determining the best refreshment parameters
according to the requirements. These parameters are
then used to design the DW refreshment process,
made up of the extracting, transforming and loading
processes.
We used STOWL as the Canonical Data Model;
all the data source schemas are translated into it.
STOWL is an OWL extension that includes
spatial, temporal and metadata elements for the
precise definition of the extracting, transforming and
loading processes.
ACKNOWLEDGEMENTS
This work has been supported by the
Research Program under project GR2007/07-2 and
by the Spanish Research Program under projects
EA-2007-0228 and TIN2005-09098-C05-03.
REFERENCES
Araque, F., Salguero, A., Delgado, C., 2007a. Monitoring
web data sources using temporal properties as an
external resources of a data warehouse. ICEIS. 28-35.
Araque, F., Carrasco, R. A., Salguero, A., Delgado, C.,
Vila, M. A., 2007b. Fuzzy Integration of a Web data
sources for Data Warehousing. Lecture Notes in
Computer Science (Vol 4739). Springer-Verlag.
Araque, F., Salguero, A., Delgado, C., Samos, J., 2006.
Algorithms for integrating temporal properties of data
in DW. 8th Int. Conf. on Enterprise Information
Systems (ICEIS). Paphos, Cyprus. May.
Haller, M., Pröll, B. , Retschitzegger, W., Tjoa, A. M.,
Wagner, R. R., 2000. Integrating Heterogeneous
Tourism Information in TIScover - The MIRO-Web
Approach. Information and Communication
Technologies in Tourism, ENTER. Barcelona (Spain)
Inmon W.H, 2002. Building the Data Warehouse. John
Wiley.
Oliva, M., Saltor, F., 1996. A Negotiation Process
Approach for Building Federated Databases. In
Proceedings of 10th ERCIM Database Research Group
Workshop on Heterogeneous Information
Management, Prague. 43–49.
Sheth, A., Larson, J., 1990. Federated Database Systems
for Managing Distributed, Heterogeneous and
Autonomous Databases. ACM Computing Surveys,
Vol. 22, No. 3.
Skoutas, D., Simitsis, A., 2006. Ontology-Based
Conceptual Design of ETL Processes for Both
Structured and Semi-Structured Data. International
Journal on Semantic Web and Information Systems,
Vol. 3, Issue 4. pp. 1-24.
Thalhammer, T., Schrefl, M., Mohania, M., 2001. Active
data warehouses: complementing OLAP with analysis
rules, Data & Knowledge Engineering, 39 (3), 241-
269.
Vassiliadis, P., Quix, C., Vassiliou, Y., Jarke, M., 2001.
Data warehouse process management. Information
Systems, Vol. 26, 205-236.
ICEIS 2008 - International Conference on Enterprise Information Systems