queries execution time. The results show that the
DDW performs better than the centralized context
for all the specific queries. The summed execution
time of the specific queries is reduced by 82%.
In Figure 5, we present the execution time of the
general queries. The results show that the DDW
performs better for some general queries but not for
others. For query 1 (sales analysis by product by
year) and query 5 (sales analysis by quarter by year),
the execution times are higher than in the
centralized context. What distinguishes these queries
is that they rely on JOIN operations between all
fragments, which are located at geographically
distant sites. In this case, the execution time is
higher than with a CDW; however, the summed
execution time of the general queries is still
reduced by 76%.
Figure 5: General queries execution time.
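The combination of slower individual queries and a lower summed time can be checked with a few lines of arithmetic. The sketch below is illustrative only: the per-query times are hypothetical values chosen for the example, not the measurements plotted in Figure 5, and the names `summed_reduction`, `cdw_times` and `ddw_times` are assumptions introduced here.

```python
def summed_reduction(centralized, distributed):
    """Relative reduction of the summed execution time (0.75 means 75%)."""
    return 1 - sum(distributed) / sum(centralized)

# Hypothetical per-query times in seconds, NOT the measurements of Figure 5.
cdw_times = [10.0, 40.0, 30.0, 15.0, 5.0]   # centralized warehouse
ddw_times = [12.0, 3.0, 3.0, 1.0, 6.0]      # distributed warehouse

# Queries (numbered from 1) that run slower in the distributed setting.
slower = [i for i, (c, d) in enumerate(zip(cdw_times, ddw_times), start=1) if d > c]

reduction = summed_reduction(cdw_times, ddw_times)
print(slower, f"{reduction:.0%}")
```

With these made-up values, queries 1 and 5 are slower in the distributed setting while the summed time still falls by 75%, because the gains on the other queries outweigh the cost of the cross-site joins.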
6 CONCLUSIONS
The design of a DDW is an optimization problem
that requires solving several interrelated problems,
including data fragmentation, fragment allocation,
and local optimization. Each problem can be solved
with several different techniques, which makes
distributed database design a very difficult task.
Although much research addresses the design of
data fragmentation, most of it focuses on the
centralized context and gives no consideration
to the distributed allocation problem. Our work is
one of the few that deal with both data
fragmentation and fragment allocation for the
purpose of decentralization. In this paper, we adapt a
DW fragmentation approach that uses the predicate
construction technique to generate the predicate list
and the primary and derived horizontal techniques to
fragment dimension and fact tables. Then, we study
three techniques for allocating fragments in a
distributed environment. First, we allocate fragments
according to the simple allocation technique. Then,
we replicate fragments where they are used,
according to the fragment allocation with replication
technique. Finally, we revise the replication of some
fragments according to the allocation with some
fragment replication technique. After that, we conduct
a computational evaluation using a DWFE and
compare the three fragment allocation cases. Finally,
we implement our approach on a real DW. The results
demonstrate that DW decentralization gives better
performance when data storage is distributed across
the company sites. However, the execution time of
queries based on JOIN operations between
all distributed fragments is higher than in a
centralized context. As future work, we intend to
study the optimization of distant OLAP queries in a
distributed DW context.
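As a reminder of what primary and derived horizontal fragmentation mean in practice, the following minimal sketch splits a dimension table by a predicate list and lets each fact row follow the dimension fragment that holds its foreign key. Everything in it is assumed for illustration: the tables `customers` and `sales`, the region predicates, and the site names are hypothetical, and the predicate construction step of our approach is not reproduced.

```python
# Hypothetical star schema: one dimension table and one fact table.
customers = [
    {"cust_id": 1, "region": "north"},
    {"cust_id": 2, "region": "south"},
    {"cust_id": 3, "region": "north"},
]
sales = [
    {"sale_id": 10, "cust_id": 1, "amount": 50},
    {"sale_id": 11, "cust_id": 2, "amount": 20},
    {"sale_id": 12, "cust_id": 3, "amount": 70},
]

# Assumed predicate list: one selection predicate per fragment.
predicates = {
    "north": lambda row: row["region"] == "north",
    "south": lambda row: row["region"] == "south",
}

def primary_fragments(dimension, predicates):
    """Primary horizontal fragmentation: select dimension rows per predicate."""
    return {name: [r for r in dimension if p(r)] for name, p in predicates.items()}

def derived_fragments(fact, dim_fragments, key):
    """Derived horizontal fragmentation: a fact row goes to the fragment
    whose dimension rows contain its foreign key (a semi-join on `key`)."""
    result = {}
    for name, rows in dim_fragments.items():
        keys = {r[key] for r in rows}
        result[name] = [f for f in fact if f[key] in keys]
    return result

dim_frags = primary_fragments(customers, predicates)
fact_frags = derived_fragments(sales, dim_frags, "cust_id")

# Simple allocation, assumed here as one fragment pair per site, no replication.
allocation = {site: (dim_frags[site], fact_frags[site]) for site in predicates}
```

A query restricted to one region then touches a single site, whereas a query that joins all fragments must collect rows from every site, which is consistent with the higher execution times reported for such queries.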
EXPERIMENTAL EVIDENCE ON DATA WAREHOUSE FRAGMENTATION AND ALLOCATION IN A
DISTRIBUTED CONTEXT