A HYBRID CLUSTERING CRITERION FOR R*-TREE
ON BUSINESS DATA
Yaokai Feng, Zhibin Wang, Akifumi Makinouchi
The Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan
Keywords: Multidimensional indices, R*-tree, clustering criterion, multidimensional range query, TPC-H.
Abstract: It is well known that multidimensional indices are effective in improving query performance on relational
data. As one successful multidimensional index structure, R*-tree, a famous member of the R-tree family,
is very popular. The clustering pattern of the objects (i.e., tuples in relational tables) among the R*-tree leaf
nodes is one of the decisive factors for the performance of range queries, a popular kind of query on business
data. How, then, is the clustering pattern formed? In this paper, we point out that the insert algorithm of R*-
tree, especially its criterion for choosing subtrees for new-coming objects, determines the
clustering pattern of the tuples among the leaf nodes. According to our discussion and observations, it
becomes clear that the present clustering criterion of R*-tree cannot lead to a good clustering pattern of
tuples when R*-tree is applied to business data, which greatly degrades query performance. A
hybrid clustering criterion for the insert algorithm of R*-tree is then introduced. Our discussion and experiments
indicate that the query performance of R*-tree on business data is clearly improved by the hybrid criterion.
1 INTRODUCTION
More and more applications need to process
multidimensional range queries on business data,
which is usually stored in relational tables. For example,
Relational On-Line Analytical Processing in a data
warehouse is required to answer complex and
varied types of range queries on large amounts of
such data. In order to obtain good performance for
such multidimensional range queries, multi-
dimensional indices are helpful (V. Markl and Bayer,
1999a; V. Markl and Bayer, 1999b), in which the
tuples are clustered among the leaf nodes to restrict
the number of nodes that must be accessed for a query.
Many index structures exist. Among them,
R*-tree (Beckmann and Kriegel, 1990) is one of the
well-known and successful ones, and it is widely used in
many applications and research works (C. Chung and
Lee, 2001; D. Papadias and Delis, 1998; H.
Horinokuchi and Makinouchi, 1999; H. P. Kriegel
and Schneider, 1993; Jurgens and Lenz, 1998). R*-
tree is also used in this study. We note, however, that
our proposal can also be applied
to other hierarchical index structures, including the
other members of the R-tree family.
In the works (C. Chung and Lee, 2001; Kotidis and
N. Roussopoulos, 1998; Jurgens and Lenz, 1998;
N. Roussopoulos and Y. Kotidis, 1997; S. Hong and
Lee, 2001), aggregate values are pre-computed
and stored in a multidimensional index as a
materialized view. When required, the aggregate
values can be retrieved efficiently. In this study, we
also use a multidimensional index for relational data.
However, our study is completely different from these related
works in that it focuses on enhancing R*-
tree to speed up the evaluation of the range queries
themselves.
In this paper, it is pointed out that the clustering
pattern of tuples among the leaf nodes is a decisive
factor for search performance. However, many
very slender leaf nodes exist when R*-tree is used to
index business data, which greatly degrades query
performance. Slender nodes are nodes whose
MBRs (Minimum Bounding Rectangles) have at least
one very narrow side (possibly even of zero length) in
some dimension(s). Clearly, slender nodes have very
small, or even zero, areas (volumes in spaces of three
or more dimensions; note that area and volume are
used interchangeably in this paper). Examples
are MBRs roughly shaped as line segments in
2-dimensional spaces and MBRs roughly shaped as planes or
line segments in 3-dimensional spaces.
According to our discussion in this paper, the
reason why so many slender leaf nodes exist
becomes clear. The insert algorithm of R*-tree,
especially its criterion (called the clustering criterion)
for choosing subtrees for new-coming tuples,
determines the clustering pattern of tuples among the
leaf nodes. After that, we make it clear that the
present clustering criterion in the insert algorithm of
R*-tree is not suitable when R*-tree is applied to business
data. Instead, a hybrid clustering criterion is
proposed. Our discussion and experiments indicate
that the query performance of R*-tree on business data
is much improved by the new clustering criterion.
The rest of the paper is organized as follows.
Section 2 describes how to use multidimensional
indices for relational data. Section 3 presents our
observations when R*-tree is applied to business data
and discusses the reasons behind these observations in
detail. Section 4 presents our proposal: a hybrid clustering
criterion for R*-tree. Section 5 gives experimental
results, and Section 6 concludes the paper.
2 INDEXING BUSINESS DATA
USING R*-TREE
In this section, let us see how to apply R*-tree to
business data and introduce some terms. Due to
space limitations, R*-tree itself is not introduced in this
paper. Readers can refer to the works (Beckmann and
Kriegel, 1990; Y. Feng, A. Makinouchi and H. Ryu,
2004).
Let T be a relational table with n attributes,
denoted by T(A1, A2, ..., An). Attribute Ai (1 ≤ i ≤ n)
has domain D(Ai), the set of possible values for Ai.
The attributes often have types such as Boolean,
integer, floating, character string, date, and so on.
Each tuple t in T is denoted by <a1, a2, ..., an>,
where ai (1 ≤ i ≤ n) is an element of D(Ai).
When R*-tree is used on relational tables, some
of the attributes are usually chosen as index
attributes, which are used to build the R*-tree. For
simplicity of description, it is supposed without
loss of generality that the first k (1 ≤ k ≤ n) attributes
of T, <A1, A2, ..., Ak>, are chosen as index
attributes. Since R*-tree can only deal with numeric
data, an order-preserving transformation is necessary
for each non-numeric index attribute. After the
necessary transformations, the k index attributes
form a k-dimensional space, called the index space,
where each tuple of T corresponds to one point.
It is not difficult to find such a mapping
function for Boolean attributes and date attributes (Y.
Feng, A. Makinouchi and H. Ryu, 2004). The work
(H. V. Jagadish and Srivastava, 2000) proposes an
efficient approach that maps character strings to real
numeric values within [0,1], where the mapping
preserves lexicographic order. This approach is
also used in this study to deal with character-string
attributes.
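To make the transformation concrete, the following is a minimal Python sketch of how the index-attribute values of a tuple could be mapped to a point in the index space. The string mapping shown here is only a simplified, order-preserving illustration of the idea; it is not the actual algorithm of (H. V. Jagadish and Srivastava, 2000), and the alphabet, prefix length, and function names are assumptions made for this example.

```python
# Hedged sketch: map a tuple's index-attribute values to a point in the
# k-dimensional index space. Only lowercase alphabetic strings are handled;
# this is NOT the exact mapping used in the cited work.

ALPHABET = "abcdefghijklmnopqrstuvwxyz"   # assumed alphabet
BASE = len(ALPHABET) + 1                  # +1 keeps "az..." below "b", preserving order

def string_to_unit_interval(s: str, prefix_len: int = 8) -> float:
    """Map a string to [0, 1) so that lexicographic order is preserved
    on the first `prefix_len` characters."""
    value, scale = 0.0, 1.0
    for ch in s[:prefix_len].lower():
        scale /= BASE
        value += (ALPHABET.index(ch) + 1) * scale   # ranks start at 1
    return value

def tuple_to_point(index_values):
    """Turn the index-attribute values of one tuple into a numeric point."""
    return tuple(
        string_to_unit_interval(v) if isinstance(v, str) else float(v)
        for v in index_values
    )

# Example: order on the string attribute is preserved after the mapping.
assert string_to_unit_interval("air") < string_to_unit_interval("rail")
print(tuple_to_point((12.0, 0.05, "truck")))
```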
We call the value range [li, ui] (1 ≤ i ≤ k) of an index
attribute Ai the data range of Ai (in this paper,
"dimension" and "index attribute" are used
interchangeably). The length of the data range of Ai,
|ui - li|, is denoted by R(Ai). The k-dimensional
hyper-rectangle [l1, u1] × [l2, u2] × ... × [lk, uk] forms
the index space. Attributes specified in the range
query condition are called query attributes.
If R*-tree is used to index business data stored
in a relational table, all the tuples are clustered in
R*-tree leaf nodes. See Figure 1.
Figure 1: Leaf nodes and query range (the black dots are tuples).
Figure 1 shows an example of leaf nodes and a query
range. The query range, given by the user, refers to the
region in which the user wants to find the results.
Clearly, from Figure 1, if the tuples are properly
clustered among the leaf nodes, the number of leaf
nodes to be accessed for this range query will drop.
Thus, the clustering pattern is a decisive factor for
query performance. The question is: who decides
the clustering pattern? The answer is the "clustering
criterion" in the insert algorithm of R*-tree.
R*-tree is constructed by inserting the objects
one by one. During construction, the insert
algorithm has to choose a proper subtree to contain
each new-coming tuple. The criterion that decides
which subtree should be chosen is called the insert
criterion or clustering criterion in this paper. Of
course, for a given dataset, this criterion decides the
final clustering pattern of the tuples among the leaf
nodes. In this paper, it will be pointed out that the
present clustering criterion of R*-tree cannot lead to
a proper clustering pattern when R*-tree is applied to
business data, and a novel clustering criterion will
be proposed.
3 OBSERVATIONS AND OUR
EXPLANATION
This section presents our observations on R*-tree
applied to business data and explains them.
3.1 Observations
As pointed out in our other work (Y. Feng, A.
Makinouchi and H. Ryu, 2004), because of the
particularities of business data, some new features
appear when R*-tree is used to index business data.
One feature of business data is that the data ranges of
the attributes are very different from each other. For
instance, the data range of "Year" from 1990 to
2003 has a length of only 13, while the amount of "Sales" for
different "Product" values may reach several hundred
thousand.
Another typical example of attributes with
small cardinalities is a Boolean attribute, which
inherently has only two possible values. Attributes of
other data types may also have semantically small
cardinalities (e.g., "Weekday" with seven values). In
the LINEITEM table of the TPC-H benchmark,
RETURNFLAG, SHIPINSTRUCT, and
SHIPMODE have only 3, 4, and 7 distinct values,
respectively, although their data type is character
string.
Figure 2 shows an example in 2-dimensional
space.
Figure 2: Tuples in the index space (x-axis: floating values; y-axis: 3 values).
In Figure 2, the y-axis has only 3 different values. In
contrast, the x-axis is of floating type and has many
possible values. Thus, the tuples (black dots) are
distributed along lines.
In order to investigate the slender nodes of an R*-tree
used on business data, an R*-tree was constructed
using the LINEITEM table of the TPC-H benchmark and
the areas (volumes) of all the leaf nodes were
computed. A total of 200,000 tuples were generated for
this table, which has 16 attributes. Six attributes,
SHIPDATE (date), QUANTITY (floating),
DISCOUNT (floating), SHIPMODE (character
string), SHIPINSTRUCT (character string), and
RETURNFLAG (character string), were selected as
index attributes since they are often used as query
attributes in the queries of the benchmark. The page
size of our system is 4KB and each leaf node can
contain at most 77 tuples. The R*-tree has 4 levels
with 4,649 leaf nodes. We observed that 2,930 of these
4,649 leaf nodes have zero area, that is, over 60%. In
addition, many other leaf nodes have only very small areas.
We also used 200,000 6-dimensional synthetic
tuples with a Zipf distribution to investigate the existence of
slender nodes, and the observation was very similar. The Zipf
distribution is often used in research related to
business data (S. Hong, B. Song and S. Lee, 2001).
Certainly, the basic reason that slender nodes
exist is the distribution of the tuples in the index space.
3.2 The Existing Clustering Criterion
in R*-tree
Since the clustering criterion has such an important effect
on the clustering pattern of tuples among the leaf nodes of R*-
tree (which is one of the decisive factors for query
performance), and since this study tries to introduce a new
clustering criterion, let us briefly recall the present
clustering criterion of R*-tree.
A new-coming tuple will be inserted into the
node (subtree) at the current level with
1) (for the leaf level) the least enlargement of overlap
area; if a tie occurs, then
2) the least enlargement of MBR area; if a tie occurs
again, then
3) the least MBR area.
This criterion means that, when the new tuple
reaches the leaf level, its insertion into each leaf node
is tried and the resulting enlargement of
overlap area among the leaf nodes is
calculated for each case. The node with the least enlargement
of overlap area is chosen to contain the new-coming
tuple. If several nodes have the least enlargement,
the enlargement of MBR area is
calculated for each case and the node with the least enlargement
of MBR area is chosen. If a tie occurs again, the
node with the smallest MBR area is chosen. If a tie
still occurs, an arbitrary one of the nodes with
the smallest MBR area is chosen. For the
intermediate levels, the enlargement of overlap area
among the nodes is not calculated and only (2) and
(3) of the criterion are used.
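As an illustration only, the following Python sketch mimics this criterion at the leaf level, representing each leaf node simply by its MBR as a (low, high) pair of coordinate tuples; this representation and the helper names are our assumptions, not the original R*-tree implementation.

```python
# Hedged sketch of the original (area-based) R*-tree clustering criterion
# at the leaf level. An MBR is a pair (low, high) of coordinate tuples.

def mbr_area(mbr):
    low, high = mbr
    area = 1.0
    for l, h in zip(low, high):
        area *= (h - l)
    return area

def enlarge(mbr, point):
    """Smallest MBR containing both the given MBR and the new point."""
    low, high = mbr
    return (tuple(min(l, p) for l, p in zip(low, point)),
            tuple(max(h, p) for h, p in zip(high, point)))

def overlap_area(a, b):
    """Area of the intersection of two MBRs (0 if they do not intersect)."""
    area = 1.0
    for (al, ah), (bl, bh) in zip(zip(*a), zip(*b)):
        side = min(ah, bh) - max(al, bl)
        if side <= 0:
            return 0.0
        area *= side
    return area

def overlap_enlargement(mbr, point, all_mbrs):
    """Increase of total overlap with the other nodes if `point` joins `mbr`."""
    grown = enlarge(mbr, point)
    return sum(overlap_area(grown, other) - overlap_area(mbr, other)
               for other in all_mbrs if other is not mbr)

def choose_leaf_original(leaf_mbrs, point):
    """(1) least overlap-area enlargement, (2) then least MBR-area enlargement,
    (3) then least MBR area; min() on the key tuple realizes this chain.
    For intermediate levels, only (2) and (3) would be used."""
    return min(leaf_mbrs, key=lambda m: (
        overlap_enlargement(m, point, leaf_mbrs),   # criterion 1
        mbr_area(enlarge(m, point)) - mbr_area(m),  # criterion 2
        mbr_area(m),                                # criterion 3
    ))

# Example: the point (6, 6) is assigned to the node that already contains it.
leaves = [((0.0, 0.0), (2.0, 2.0)), ((5.0, 5.0), (8.0, 8.0))]
print(choose_leaf_original(leaves, (6.0, 6.0)))   # -> ((5.0, 5.0), (8.0, 8.0))
```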
In the next subsection, we will see that the
existence of slender leaf nodes forms a "positive
feedback" loop. That is, once some slender leaf nodes
exist, more and more of them appear as new
tuples are inserted, which greatly deteriorates search
performance.
3.3 Positive Feedback
Let us consider the insertion algorithm of R*-tree,
using the example depicted in Figure 3 (a). Node A
is a slender node and point p is to be newly inserted.
Certainly, it should be inserted into Node B since it is
much nearer to Node B than to Node A. However,
according to the insert algorithm of R*-tree, p will
be inserted into Node A in this case. This is because
the area increment of doing so is smaller than that of
inserting p into Node B. Even if the enlargement of
overlap area among the nodes at this level is
considered, Node A still tends to be chosen. After p
is inserted into Node A, Node A becomes very long,
which may worsen the overlap between Node A
and the other nodes.
Figure 3: Slender nodes exist (panels (a) and (b); Node A, Node B, and the new point p).
Let us see another case, shown in Figure 3 (b).
There are two MBRs shaped as line segments, A and
B. Let us assume that p is a new tuple to be inserted. Where
should it go? Intuitively, p should be included in
Node B. Actually, p may be inserted into Node A,
although this enlarges the overlap (between A and
B) and also leads to a long Node A. This is because
the insert algorithm of R*-tree cannot determine
which node, A or B, should be selected, since both the
overlap area increment and the area increment of
selecting A or B are 0. As a result, either
Node A or Node B is selected by default, without
consideration of the actual overlap. Here, we assume
without loss of generality that Node A is selected.
As in the previous case, after p is inserted into Node A,
Node A becomes very long, which may worsen
the overlap between Node A and the other nodes.
And, when a new point (tuple) with the same y-axis
coordinate as p is inserted, the same process is
repeated and the new point is also inserted into Node
A.
In this way, the new-coming tuples tend to be
inserted into the existing slender nodes; the
repeated insertion of such tuples leads to the
overflow of slender nodes, and the slender nodes are
split again and again. As a result,
1) many slender nodes are generated,
2) the space utilization of such nodes degrades
greatly and the total number of nodes in the R*-tree
tends to increase,
3) very slender nodes tend to be very long (there is
a lower bound on the number of tuples in each leaf
node),
4) overlapping among the leaf nodes becomes very heavy,
which greatly harms search performance. This
study does not aim at eliminating slender nodes,
since their existence basically stems from the
distribution of the tuples (as mentioned
above). The main purpose of this study is to
decrease the overlap among the leaf nodes by making
the clustering pattern more proper and reasonable.
At the same time, the number of slender nodes is
also decreased and the total space utilization of the
nodes can be improved.
4 A HYBRID CLUSTERING
CRITERION
Generally speaking, the present clustering criterion
of R*-tree (recalled above) is based on area,
including overlap area enlargement, MBR area
enlargement, and MBR area, which leads to many
slender nodes and very heavy overlap among the
leaf nodes. In this section, we explain how to deal
with the problem of slender nodes using a hybrid
clustering criterion.
Our approach to this problem consists of the
following two points.
(1) Modifying the area calculation.
Why can a proper subtree or leaf node not be
found for new-coming tuples? The reason is that the
enlargements of both the overlap area and the MBR area
are zero for 0-area nodes. Thus, the candidate nodes
for inserting a tuple cannot be compared in a
reasonable way.
In order to avoid this situation, we modify the
area calculation. That is, when the area of a
rectangle (a node MBR or the overlap region of two
node MBRs) is calculated, every zero-length side
of this rectangle, if any exists, is set to
a trivial non-zero positive value (e.g., 10^-4 in our
experiments).
Let us recall the original area calculation of
rectangle R as follows.
Area(R) = \prod_{i=1}^{d} S_i,

where S_i is the side length of R in dimension i and d is the dimensionality of the index space.
In this study, this area calculation is modified as
follows.
Area(R) = \prod_{i=1}^{d} S'_i, \quad S'_i = \begin{cases} \text{trivial-value}, & S_i = 0 \\ S_i, & \text{otherwise,} \end{cases}

where the trivial-value is set to 10^{-4} in this paper.
In any case, this trivial value must be less than the unit
of the corresponding attribute to avoid confusion. At the same time,
the trivial value should not be too small, or the
calculation result cannot be represented. These two
conditions are not difficult to guarantee in real
applications. In this way, most of the incomparable
situations caused by 0-area nodes can be avoided.
Note that this modification only changes the
clustering pattern of the tuples among the leaf nodes;
it has no effect on the correctness of the query results.
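As a concrete illustration, the following is a minimal Python sketch of the modified area calculation for an MBR given by its minimum and maximum corner coordinates (the same (low, high) representation as in the sketch of Section 3.2); the value 10^-4 follows the paper, while the function and constant names are ours.

```python
TRIVIAL_VALUE = 1e-4  # must stay below the unit of every index attribute

def modified_area(low, high):
    """Area (volume) of an MBR where every zero-length side is replaced by a
    small trivial value, so that 0-area (slender) nodes remain comparable."""
    area = 1.0
    for l, h in zip(low, high):
        side = h - l
        area *= side if side > 0 else TRIVIAL_VALUE
    return area

# Example: a "line-segment" MBR in 2-D no longer has an area of exactly 0.
# modified_area((0.0, 5.0), (10.0, 5.0)) == 10.0 * 1e-4
```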
(2) Introducing a distance criterion.
If the above area criterion still cannot decide
which subtree or leaf node is most suitable for a
new-coming tuple, which means the area-based
clustering criterion is no longer effective, the subtree
or leaf node nearest to the new-coming tuple is
chosen.
In summary, the hybrid clustering criterion
combines the modified area-based criterion with a
distance-based one. The procedure is as follows.
1) For the leaf level, compare the enlargements of
overlap areas using the modified calculation. If a
tie occurs, then
2) compare the enlargements of MBR areas using
the modified calculation. If a tie occurs, then
3) choose the nearest subtree (a leaf node at the leaf
level).
Now, let us see how to calculate the distance from
one point to a rectangle region.
For a point p = (p_1, ..., p_d) and a rectangle R, let s = (s_1, ..., s_d) and t = (t_1, ..., t_d) be the two vertices of the node MBR with the minimum and the maximum coordinates on each axis, respectively. The distance from p to R, dist(p, R), is given by

dist(p, R) = \sum_{i=1}^{d} (p_i - r_i)^2, \quad \text{where } r_i = \begin{cases} s_i, & p_i < s_i \\ t_i, & p_i > t_i \\ p_i, & \text{otherwise.} \end{cases}
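The following Python sketch illustrates this distance computation and how it could serve as the final tie-breaker of the hybrid criterion. It reuses the enlarge helper from the sketch in Section 3.2 and modified_area from the sketch above; these names and the MBR representation are assumptions of our sketches, not part of the original implementation.

```python
# Hedged sketch: point-to-MBR distance (the formula above) and the hybrid
# tie-breaking chain. Assumes enlarge() and modified_area() from the earlier
# sketches are available in the same module.

def dist_point_to_mbr(p, mbr):
    """Squared distance from point p to an MBR (low, high):
    r_i = s_i if p_i < s_i, t_i if p_i > t_i, else p_i."""
    s, t = mbr
    total = 0.0
    for pi, si, ti in zip(p, s, t):
        ri = si if pi < si else (ti if pi > ti else pi)
        total += (pi - ri) ** 2
    return total

def choose_leaf_hybrid(leaf_mbrs, point):
    """Steps 1-2 use the modified area calculation; step 3 breaks the
    remaining ties by distance to the node MBR."""
    def mod_overlap(a, b):
        # Overlap region evaluated with the modified area calculation.
        inter_low = tuple(max(al, bl) for al, bl in zip(a[0], b[0]))
        inter_high = tuple(min(ah, bh) for ah, bh in zip(a[1], b[1]))
        if any(h < l for l, h in zip(inter_low, inter_high)):
            return 0.0
        return modified_area(inter_low, inter_high)

    def overlap_enlargement_mod(m):
        grown = enlarge(m, point)
        return sum(mod_overlap(grown, other) - mod_overlap(m, other)
                   for other in leaf_mbrs if other is not m)

    return min(leaf_mbrs, key=lambda m: (
        overlap_enlargement_mod(m),                             # step 1
        modified_area(*enlarge(m, point)) - modified_area(*m),  # step 2
        dist_point_to_mbr(point, m),                            # step 3
    ))
```

With this tie-breaker, the tuple p in Figure 3 (b) is assigned to the nearer line-segment node instead of an arbitrary one, which is exactly the behavior the hybrid criterion aims for.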
5 EXPERIMENTS
Using the TPC-H data (Council, 1999), we
performed various experiments to show how much
the range query performance is improved using the
hybrid clustering criterion.
Dataset and index attributes: the LINEITEM table of the
TPC-H benchmark, which has 16 attributes of
various data types, including floating, integer, date,
character string, and Boolean. The table used in our experiments
has 200,000 tuples. Six of the 16 attributes are
chosen as index attributes, namely SHIPDATE
(date), QUANTITY (floating), DISCOUNT
(floating), SHIPMODE (character string),
SHIPINSTRUCT (character string), and
RETURNFLAG (character string), since they are
often used as query attributes in the queries of the
benchmark.
System: the page size in our system is 4KB and
all the index structures are built on a "one node,
one page" basis.
Queries: the query ranges of QUANTITY
(floating) and DISCOUNT (floating) are both
varied from 10% to 100%. For the date attribute
SHIPDATE, the query range is a period
of one year, selected randomly for each
query. For the other three attributes (character strings),
since their numbers of possible distinct values are
only 3, 4, and 7, respectively, one value is chosen
randomly for each of them. Each query is
repeated 100 times at different locations and the
average number of distinct nodes accessed is
reported. The average number of node accesses is a
common criterion for evaluating query performance
(H. V. Jagadish and Srivastava, 2000).
5.1 Effect of the Hybrid Clustering
Criterion on R*-tree
In order to determine the effect of the new clustering
criterion on R*-tree itself, the total numbers of nodes
in the R*-trees built with the different clustering criteria
were counted; the result is presented in Table 1, where M refers to the
upper bound on the number of tuples contained in each
leaf node of the R*-tree.
Table 1: R*-tree with different clustering criteria.

                          R*-tree with original     R*-tree with hybrid
                          clustering criterion      clustering criterion
  M                       77                        77
  Height                  4                         4
  Total number of nodes   4892                      3783
From Table 1, we can see that the hybrid
clustering criterion makes the R*-tree more compact.
5.2 Effect of the Hybrid Clustering
Criterion on Query Performance
Table 2: Comparison of the number of distinct nodes accessed.

  Query range   R*-tree with original     R*-tree with hybrid
                clustering criterion      clustering criterion
  10%           369.91                    95.12
  20%           648.90                    126.33
  30%           603.65                    131.31
  40%           388.67                    137.30
  50%           683.29                    237.27
  60%           489.00                    248.10
  70%           708.24                    231.10
  80%           691.89                    275.48
  90%           571.10                    357.62
  100%          764.55                    358.49
The comparison of the number of distinct nodes
accessed is given in Table 2.
From Table 2, we can see that the hybrid
clustering criterion greatly improves query
performance. Note, however, that:
(1) In Table 2, the first column, query range, refers
to the side length of the query range in the two
floating attributes, i.e., QUANTITY and
DISCOUNT. The query with the same query-range
size in the floating attributes is repeated 100 times
at different (random) locations. However, this
query range does not apply to the other index
attributes, as explained before.
(2) According to Table 2, the number of distinct
nodes accessed does not always increase as the "query
range" in the first column grows. This is because
the query ranges in the other four index attributes
change randomly at the same time as the query
ranges in the two floating attributes grow.
Moreover, the CPU time cost is also tested and
compared; the result is presented in Table 3.
From Table 3, we can observe that the hybrid
clustering criterion also leads to shorter CPU times,
which means that it is effective even for a main-
memory-resident R*-tree, where I/O is no longer
the bottleneck of query performance. Note that
our OS is FreeBSD 4.9 and the main memory is 128MB.
Table 3: Comparison of CPU time (ms).

  Query range   R*-tree with original     R*-tree with hybrid
                clustering criterion      clustering criterion
  10%           16.401                    5.939
  20%           28.180                    8.117
  30%           26.582                    8.499
  40%           17.780                    9.074
  50%           33.137                    15.817
  60%           25.103                    16.769
  70%           33.940                    15.874
  80%           34.420                    19.101
  90%           32.721                    24.772
  100%          41.671                    25.751
6 CONCLUSIONS
It is important to process various types of range
queries on business data. R*-tree is one of the
successful multidimensional index structures and is
also helpful for improving query performance on
business data. In this paper, we pointed out that
many slender nodes, including many 0-area nodes,
exist when R*-tree is applied to business data, which
greatly degrades query performance. The reason why
so many slender nodes occur was made clear,
and a hybrid clustering criterion was introduced
to deal with the problem of slender nodes.
According to our discussion, the hybrid clustering
criterion can improve the clustering pattern of tuples
among the leaf nodes; in particular, it can decrease the
overlap among the leaf nodes. Our approach
clearly improved the query performance of R*-tree in
our experiments.
ACKNOWLEDGEMENT
This research is partially supported by Japan Society
for the Promotion of Science, Grant-in-Aid for
Scientific Research 15650017 and 16200005.
REFERENCES
C. Chung, S. Chun, J. Lee, and S. Lee (2001). Dynamic
Update Cube for Range-Sum Queries. Proc. VLDB
Intl. Conf.
Council (1999). TPC Benchmark H Standard Specification
(Decision Support). http://www.tpc.org/tpch/
D. Papadias, N. Mamoulis, and V. Delis (1998).
Algorithms for Querying by Spatial Structure. Proc.
VLDB Intl. Conf.
H. Horinokuchi, and A. Makinouchi (1999). Normalized
R*-tree for Spatiotemporal Databases and Its
Performance Tests. IPSJ Journal, Vol. 40, No. 3.
H. P. Kriegel, T. Brinkhoff, and R. Schneider (1993).
Efficient Spatial Query Processing in Geographic
Database Systems.
H. V. Jagadish, N. Koudas, and D. Srivastava (2000). On
Effective Multi-Dimensional Indexing for Strings.
Proc. ACM SIGMOD Intl. Conf.
J. Han and M. Kamber (2001). Data Mining—Concepts
and Techniques. Morgan Kaufmann press.
M. Jurgens, and H.-J. Lenz (1998). The Ra*-tree: An
Improved R-tree with Materialized Data for
Supporting Range Queries on OLAP-Data. Proc.
DEXA Workshop.
N. Beckmann, and H. Kriegel (1990). The R*-tree: An
Efficient and Robust Access Method for Points and
Rectangles. Proc. ACM SIGMOD Intl. Conf.
N. Roussopoulos, S. Kelley, and F. Vincent (1995). Nearest
Neighbor Queries. Proc. ACM SIGMOD Intl. Conf.
N. Roussopoulos, Y. Kotidis, and M. Roussopoulos (1997).
Cubetree: Organization of and Bulk Incremental
Updates on the Data Cube. Proc. ACM SIGMOD Intl.
Conf.
R. Agrawal, A. Gupta, and S. Sarawagi (1997).
Modeling Multidimensional Databases. Proc. Intl. Conf.
on Data Engineering (ICDE).
S. Hong, B. Song, and S. Lee (2001). Efficient Execution of
Range-Aggregate Queries in Data Warehouse
Environments. Proc. 20th Intl. Conf. on Conceptual
Modeling (ER 2001).
V. Markl, F. Ramsak, and R. Bayer (1999a). Improving
OLAP Performance by Multidimensional Hierarchical
Clustering. Proc. IDEAS Intl. Symposium.
V. Markl, M. Zirkel, and R. Bayer (1999b). Processing
Operations with Restrictions in Relational Database
Management Systems without external Sorting. Proc.
Intl. Conf. on Data Engineering.
Y. Feng, A. Makinouchi, and H. Ryu (2004). Improving
Query Performance on OLAP-Data Using Enhanced
Multidimensional Indices. Proc. ICEIS Intl. Conf.
Y. Kotidis, and N. Roussopoulos (1998). An Alternative
Storage Organization for ROLAP Aggregate Views
Based on Cubetrees. Proc. ACM SIGMOD Intl. Conf.