predominantly horizontal or vertical (more or less).
This led to the following questions. What would
happen if:
1. More diagonal line segments are indexed?
2. The line segments are longer?
We are interested in whether these changes affect overlap,
overcoverage, and tree height.
3.4 Data Sets
For our follow-up evaluation, we generated collections
of lines that vary in set size and slope. Each line
set contains between 100 and 100,000 lines. We use a
line length of 10 units in all cases. We chose to generate
our own line sets for the follow-up evaluation because
this allows us to generate data that reflects
the best-case, average-case and worst-case scenarios,
and to determine accordingly how the mqr-tree will
perform for each scenario.
For slope, we test both vertical and horizontal
lines, as well as diagonal lines.
We create three types of files:
• half horizontal, half vertical lines. This is consid-
ered to be the best-case scenario. Every line in
this set will have a minimum bounding rectangle
with an overcoverage of zero. Overcoverage will
still exist in higher-level minimum bounding rect-
angles that encompass the lines.
• equal distribution between horizontal, vertical,
slope of 1/2, slope of 1, slope of 2, slope of -1/2,
slope of -1 and slope of -2. This is considered the
average case. Here we have both lines that can
be contained within a minimum bounding rectangle
with zero overcoverage, and lines that will achieve the
worst overcoverage when contained within a
minimum bounding rectangle.
• equal distribution between slope of 1/2, slope of
1, slope of 2, slope of -1/2, slope of -1 and slope
of -2. This is considered to be the worst case.
Here, all minimum bounding rectangles that
contain lines have non-zero overcoverage.
Using these data sets, we run the same tests as
above for the road and railroad data.
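The three kinds of line sets described above could be generated along the following lines. This is a minimal sketch, not the generator used for the evaluation: the coordinate extent of 1000 units, the uniform random placement of start points, the fixed seed, and the names `make_line` and `generate_line_set` are all assumptions introduced for illustration. Only the line length of 10 units and the list of slopes come from the text.

```python
import math
import random

# Slopes named in the text; None stands for a vertical line (undefined slope).
AXIS_SLOPES = [0.0, None]
DIAGONAL_SLOPES = [0.5, 1.0, 2.0, -0.5, -1.0, -2.0]


def make_line(x, y, slope, length=10.0):
    """Return a segment (x1, y1, x2, y2) of the given length starting at (x, y)."""
    if slope is None:  # vertical line
        dx, dy = 0.0, length
    else:
        # Choose dx so that sqrt(dx^2 + (slope*dx)^2) equals the length.
        dx = length / math.sqrt(1.0 + slope * slope)
        dy = slope * dx
    return (x, y, x + dx, y + dy)


def generate_line_set(n, slopes, extent=1000.0, seed=0):
    """Generate n lines, cycling through `slopes` so each slope is used
    as evenly as possible. `extent` and `seed` are assumed parameters."""
    rng = random.Random(seed)
    lines = []
    for i in range(n):
        slope = slopes[i % len(slopes)]
        x = rng.uniform(0.0, extent)
        y = rng.uniform(0.0, extent)
        lines.append(make_line(x, y, slope))
    return lines


best_case = generate_line_set(100, AXIS_SLOPES)
average_case = generate_line_set(100, AXIS_SLOPES + DIAGONAL_SLOPES)
worst_case = generate_line_set(100, DIAGONAL_SLOPES)
```

Cycling through the slope list, rather than drawing slopes at random, keeps the distribution between slopes as equal as the set size permits.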
3.5 Evaluation on Vertical and Horizontal Data Sets
Our first evaluation compares the mqr-tree with
the R-tree on indexing the horizontal and vertical line
sets. This is expected to produce the best results since,
at the leaf level, the overcoverage of minimum bounding
rectangles will be zero and the overlap will be
very low (effectively, the only overlap occurs at the
intersection points between two lines).
Table 2 presents the results for the horizontal and
vertical line sets. Again, we find the most significant
result to be the improvement in overlap. The mqr-tree
achieves lower overlap in all cases. Although
the improvements are not as large as with the
road and railroad data, they are still significant,
especially in the data sets with the higher number of
line segments. Overall, we find in the smaller sets
approximately 45-50% lower overlap than the R-tree,
while in the larger sets the improvement
is as high as 92%. We also find the same trends
for coverage and overcoverage, with improvements
that increase from 3% to 58% for coverage and from
9% to 73% for overcoverage. The height, although
still high in the mqr-tree, is comparable to that
obtained in the initial road and railroad data tests.
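For readers who want to reproduce such percentages from raw measurements, one plausible reading of "improvement" is the relative reduction with respect to the R-tree value. The sketch below assumes that definition, which the text does not state explicitly; the function name `percent_improvement` is hypothetical, and the input numbers are placeholders that are not taken from Table 2.

```python
def percent_improvement(rtree_value, mqr_value):
    """Percentage by which mqr_value is lower than rtree_value.

    Assumes improvement is measured relative to the R-tree baseline.
    """
    if rtree_value <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (rtree_value - mqr_value) / rtree_value


# Placeholder numbers for illustration only; NOT values from the tables.
print(percent_improvement(200.0, 100.0))  # 50.0
```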
3.6 Evaluation on Uniform Data Sets
Our next evaluation compares the mqr-tree and R-tree
using the uniform data sets, each of which contains
equal numbers of vertical lines, horizontal lines,
and lines of each tested slope. Table 3 presents these
results. We find results that are very similar to those
for the horizontal and vertical line sets. We find that
the mqr-tree achieves an improvement in overlap that
ranges from 43% for the smaller data sets to 90% for
the largest one. Similarly, we find improvements in
coverage that fall between 3% and 57%, and in
overcoverage that fall between 14% and 77%. This is
very reassuring because it appears that diagonal lines
(which in this case make up three quarters of each data set)
do not significantly affect the performance criteria.
3.7 Evaluation on Sloped Lines Only
Our last test compares the performance of the
mqr-tree and R-tree on data sets containing only
sloped lines, and no horizontal or vertical lines. This
is considered to be our worst-case scenario because at
the leaf level, all minimum bounding rectangles will
have significant overcoverage. However, we still find
in Table 4 that for the most part the performance
improvements of the mqr-tree over the R-tree are as
significant as those found in the other evaluations. The
only improvement that is not as significant is the
overlap decrease for the smallest test set. However,
the mqr-tree still achieves lower overlap in this case.
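To see why sloped lines form the worst case at the leaf level, consider the area of the minimum bounding rectangle of a single line of length 10. Since a line itself has no area, the sketch below assumes that the whole rectangle area counts as overcoverage; that interpretation and the function name `mbr_area` are assumptions made for illustration. For a slope m and length L the area works out to L^2 |m| / (1 + m^2).

```python
import math


def mbr_area(slope, length=10.0):
    """Area of the minimum bounding rectangle of a line with the given slope.

    slope=None stands for a vertical line. Horizontal and vertical lines
    have a degenerate rectangle with zero area.
    """
    if slope is None:
        return 0.0
    dx = length / math.sqrt(1.0 + slope * slope)  # horizontal extent
    dy = abs(slope) * dx                          # vertical extent
    return dx * dy


for s in (0.0, None, 0.5, 1.0, 2.0):
    print(s, round(mbr_area(s), 2))
```

Under this assumption, slopes of 1 and -1 give the largest rectangle (area 50 for a length of 10), slopes of 1/2, 2, -1/2 and -2 give an area of 40, and horizontal and vertical lines give zero.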
ICEIS 2011 - 13th International Conference on Enterprise Information Systems