all the year has to be separated from the full date. On
the other hand in real life situation in the fact table
the foreign keys to the time dimension are stored.
But in the time dimension usually is a column called
“Year”. This is a reason why we can avoid the
function year. In any case it is not the difficult one
as function that extracts year from full date usually
is built in DBMS. The mandatory function in this
query is the one that can extract day and month from
the full date. In this example we are calling it
day_month. In a lot of situations it can be avoided
using substrings, concatenations and data type
translation functions. A little bit similar is the
function between. This function in DBMS level is
defined for built in data types like date, integer, real.
This is why it should be overloaded in a way that
between can work with partial date as the year this
time is not important.
The results from query in code (3) are presented
in table 2.
Table 2: Company’s employees’ count change dynamic
calculated on each year 1
st
of July.
When the fact splitting is used, the record count
in fact table is growing. It is not possible to predict
the percentage of growth for all situations. For each
fact table it can be very different. It depends on the
nature of facts, that is, how slow evolving they are.
The second point is, how long is the period, from
which we want to take the value for making
dynamics. It can be day, hour, month or year. In our
project, where the contracts of employment are
stored, the record count grew about three times.
Other benefit, which can be got by combining two
different facts into one fact table, is less storage
space. This is not only because we do not need to
make almost the same fact tables with almost the
same primary keys. Also the indexes should not be
duplicated.
6 CONCLUSION
We introduced solutions for implementing splitted
facts using weights in this paper. There were also
given an example of how to use such structures in
real world situation. However, fact splitting can be
used in other situations as well. One of them is fact
splitting in bitemporal data warehouses. The facts
can be splitted accordingly to transaction time and
validity time overlapping intervals. In that way we
could analyze data looking at events history from
different perspectives.
Other situation when splitting facts could be
appropriate solution is having inconsistent data from
different data sources that has to be integrated. Such
situations should be researched more closely.
REFERENCES
Abelló, A., Samos, J., and Saltor, F. 2001. Understanding
facts in a multidimensional object-oriented model. In:
Proc. of the 4th ACM international Workshop on Data
Warehousing and OLAP. ACM Press, 32-39.
Abelló, A., Martin C. 2003. A Bitemporal Storage
Structure for a Corporate Data Warehouse. In: Proc.
of the 5th Int. Conf. of Enterprise Information Systems
(ICEIS 2003), 177-183.
Chen, C., Cochinwala, M., and Yueh, E. 1999. Dealing
with slow-evolving fact: a case study on inventory
data warehousing. In: Proc. of the 2nd ACM
international Workshop on Data Warehousing and
OLAP. ACM Press, 22-29.
Eder, J. , Koncilia, C. , and Kogler, H. 2002. Temporal
data warehousing: business cases and solutions. In:
Proc. of the International Conference on Enterprise
Information Systems (ICEIS'02), Spain, 81-88.
Hüsemann, B. , Lechtenbörger, J., and Vossen., G. 2000.
Conceptual Data Warehouse Design. In Proc. of the
International Workshop on Design and Management
of Data Warehouses (DMDW 2000), CEUR-WS
(www.ceur-ws.org).
Inmon, W.H.. 1996. Building the Data Warehouse, John
Wiley.
Jarke, M., Lanzerini, M., and Vassiliou, Y. 2002.
Fundamentals of Data Warehouses, Berlin:Springer.
Kimball, R. 1996. The Data Warehouse Toolkit: Practical
Techniques for building Dimensional Data
Warehouses. Jon Wiley & Sons.
Kimball, R., Reeves, L., Ross, M., and Tthornthwaite, W.,
1998. The Data Warehouse Lifecycle Toolkit: Expert
Methods for Designing, Developing and Deploying
Data Warehouses, New York: Jon Wiley & Sons.
Kimball, R. and Ross M., 2002. The Data Warehouse
Toolkit: The Complete Guide to Dimensional
Modeling, John Wiley.
Song, I.-Y. , Rowen, W. , Medsker, C. , and Ewen,E.
2001. An Analysis of Many-to-Many Relationships
Between Fact and Dimension Tables in Dimensional
Modeling. In Proc. of the Int. Workshop on Design
and Management of Data Warehouses (DMDW'2001).
CEUR-WS (www.ceur-ws.org).
ICEIS 2006 - DATABASES AND INFORMATION SYSTEMS INTEGRATION
258