Table 2: Averages for Varying Distribution.
Distribution  #Nodes  Height  Coverage      Overcoverage  Overlap     #Seeks/Ins  #Splits/Ins
Uniform       654.06  12.31   1,233,595.22  138,768.90    129,630.80  30.74       1.12
Exponential   652.63  12.31     638,430.98   64,157.58     67,130.08  30.71       1.10
jects inserted on the tree height, and the average num-
ber of disk accesses required for inserting an object.
Initially, the tree grows in height quickly but growth
slows significantly as more objects are inserted. The
same occurs for the number of disk accesses.
Figure 6b shows the effect of the number of objects
inserted on the coverage, overlap, and overcoverage.
The results show that coverage increases linearly as
the number of objects increases. In addition, overlap
and overcoverage increase at a significantly lower rate
as the number of objects increases. Coverage includes
the object coverage, while overlap and overcoverage
are calculated only for non-leaf nodes. One reason for
the lower growth in overlap and overcoverage is the
ability of the 2DR-tree to “cluster” objects located
close together as the number of objects increases,
which reduces both the overlap and the wasted space
in non-leaf approximations.
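The three measures can be made concrete with a small sketch. The definitions below are assumptions made for illustration, since this section does not restate them: coverage is taken as the summed area of a set of rectangles, overlap as the summed pairwise intersection area, and overcoverage as the part of a bounding rectangle that none of its children covers (the wasted space). All function names are hypothetical and are not taken from the 2DR-tree implementation.

```python
from itertools import combinations

# A rectangle is a tuple (xmin, ymin, xmax, ymax).
# Definitions and names are illustrative assumptions, not the paper's code.

def area(r):
    """Area of one rectangle (0 if degenerate)."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])

def mbr(rects):
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def intersection_area(a, b):
    """Area shared by two rectangles (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def union_area(rects):
    """Area covered by at least one rectangle, via a coordinate grid."""
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    total = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            # A grid cell counts once if any rectangle contains it.
            if any(r[0] <= x0 and x1 <= r[2] and r[1] <= y0 and y1 <= r[3]
                   for r in rects):
                total += (x1 - x0) * (y1 - y0)
    return total

def coverage(rects):
    """Summed area of all rectangles."""
    return sum(area(r) for r in rects)

def overlap(rects):
    """Summed pairwise intersection area (regions shared by three or
    more rectangles are counted once per pair)."""
    return sum(intersection_area(a, b) for a, b in combinations(rects, 2))

def overcoverage(children):
    """Area of the children's MBR that no child actually covers."""
    return area(mbr(children)) - union_area(children)

# Two overlapping rectangles plus one distant outlier.
children = [(0, 0, 2, 2), (1, 1, 3, 3), (5, 0, 6, 1)]
print(coverage(children), overlap(children), overcoverage(children))
```

In this example the distant third rectangle stretches the bounding rectangle and accounts for most of the wasted space, which is the effect that grouping nearby objects is meant to limit.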
Table 2 shows the averages for each run when varying
the distribution of the data set. The results show a
significant difference in coverage, overcoverage, and
overlap. The surprising result is that, when indexing
exponentially distributed data, the 2DR-tree achieves
almost 50% lower coverage and overlap (about 48% each
in Table 2) and 54% lower overcoverage than when
indexing uniformly distributed data. The height, number
of nodes, and space utilization are not factors in this
result because they do not differ significantly between
the data distributions. After many insertions, “chains”
start to appear: sequences of many non-leaf nodes that
lead to a single node holding few objects, possibly
only one. An advantage of chains is that outliers are
separated from a cluster of objects, which reduces the
coverage, overcoverage, and overlap of MBRs.
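The relative differences quoted above can be recomputed directly from Table 2. The following is a minimal sketch: the numbers are copied from the table, while the dictionary layout and the function name are illustrative assumptions.

```python
# Averages copied from Table 2; the layout and names are illustrative.
table2 = {
    "Uniform": {
        "coverage": 1233595.22,
        "overcoverage": 138768.90,
        "overlap": 129630.80,
    },
    "Exponential": {
        "coverage": 638430.98,
        "overcoverage": 64157.58,
        "overlap": 67130.08,
    },
}

def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100.0

# Reduction of each measure for exponential data relative to uniform data.
reductions = {
    metric: relative_reduction(table2["Uniform"][metric],
                               table2["Exponential"][metric])
    for metric in table2["Uniform"]
}

for metric, pct in reductions.items():
    print(f"{metric}: {pct:.1f}% lower for exponential data")
```

Rounded to whole percentages, this gives roughly 48% for coverage and overlap and roughly 54% for overcoverage.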
6 CONCLUSION
This paper presents work on the 2DR-tree, which
preserves spatial relationships between all objects by
using nodes that have the same dimensionality as the
object set. This structure supports non-linear search
strategies. We present the insertion strategy and some
preliminary evaluation results. The results show that
the 2DR-tree is ideal for larger object sets with
respect to tree height. The average numbers of disk
accesses and splits per insert are reasonable. In
addition, it is ideal for a dynamic, skewed data set,
for which it achieves lower coverage, overcoverage, and
overlap than for a dynamic, uniformly distributed data
set.
Future research directions include: 1) a performance
evaluation versus other proposed SAMs; 2) improving
the average space utilization, which is very low;
3) developing an algorithm for bottom-up tree
construction applicable to static data sets; and
4) extending the 2DR-tree to three dimensions.
ICEIS 2007 - International Conference on Enterprise Information Systems