4.1 Join Queries
This section provides an overview of the setup, test
cases, and results of our join queries simulation.
In order to make the comparison between the
two methods as clear and fair as possible, we began
our simulation setup by identifying points of
similarity in the Ren et al. and visualization
algorithms. We developed an efficient method for
modelling a small cache and reading in user queries
and implemented the two query processing
algorithms with two programs based on this
modelling method. By using the same data structures
in both programs to hold the simulated cache, the
user queries, the temporary remainder query (after
processing each individual semantic region), and so
on, we were able to produce two streamlined
programs that worked in much the same way, except
for the specific methods of query trimming. This
approach allowed us to compare the efficiency of the
two query trimming methods directly. Both
programs were based on an original object class,
RelationPredicate(), which contained the
table name, primary keys, attributes, and compare
predicates for each query. Each program was written
in Java and ran on a Pentium machine with 2 GB of
RAM under Windows Vista.
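To make the role of this class concrete, the following is a minimal sketch of what a RelationPredicate-style object could look like. Only the four kinds of content (table name, primary keys, attributes, compare predicates) come from the description above; the ComparePredicate helper, every field and method name, and the choice of "id" as the primary key are assumptions made for illustration and are not the original implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: one comparison such as x >= 3 (names are assumptions).
final class ComparePredicate {
    final String attribute; // e.g. "x"
    final String operator;  // one of ">=", "<=", "="
    final double value;

    ComparePredicate(String attribute, String operator, double value) {
        this.attribute = attribute;
        this.operator = operator;
        this.value = value;
    }

    // True when the given attribute value satisfies this comparison.
    boolean test(double v) {
        switch (operator) {
            case ">=": return v >= value;
            case "<=": return v <= value;
            case "=":  return v == value;
            default:
                throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
    }
}

// Hypothetical sketch of the container described in the text.
final class RelationPredicate {
    final String tableName;
    final List<String> primaryKeys;
    final List<String> attributes;
    final List<ComparePredicate> comparePredicates;

    RelationPredicate(String tableName, List<String> primaryKeys,
                      List<String> attributes, List<ComparePredicate> comparePredicates) {
        this.tableName = tableName;
        // Defensive copies keep the object immutable after construction.
        this.primaryKeys = Collections.unmodifiableList(new ArrayList<>(primaryKeys));
        this.attributes = Collections.unmodifiableList(new ArrayList<>(attributes));
        this.comparePredicates =
                Collections.unmodifiableList(new ArrayList<>(comparePredicates));
    }

    // The t1 part of Q1 in Table 1: t1.x in [3, 8] and t1.y in [8, 12].
    // Treating "id" as the primary key is an assumption.
    static RelationPredicate q1OnT1() {
        return new RelationPredicate(
                "t1",
                Arrays.asList("id"),
                Arrays.asList("x", "y"),
                Arrays.asList(
                        new ComparePredicate("x", ">=", 3),
                        new ComparePredicate("x", "<=", 8),
                        new ComparePredicate("y", ">=", 8),
                        new ComparePredicate("y", "<=", 12)));
    }
}
```

Keeping the same container in both programs means that any difference in measured time can be attributed to the trimming logic rather than to how queries are stored.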
4.1.1 Test Cases
Following similar simulations conducted in (Ren, et
al., 2003), (Guo, et al., 1996), (Hao, et al., 2005) and
(Li, et al., 2008), we chose to compare the two
programs on the basis of execution time. We
modelled growing query complexity by gradually
increasing the number of semantic regions to be
processed. Because execution times measured this way
can vary widely when other operations are running on
the system, we ran each simulation 15 times and
computed the mean of the middle 10 results, which
let us discard obvious outliers. We selected a variety of test
cases (Table 1 lists the test queries), ranging from
full containment (no remainder query generated) to
no intersection between the user query and the
semantic region (no probe query generated).
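The averaging scheme described above (15 runs, mean of the middle 10) can be sketched as follows. This is an illustration only: the class and method names are hypothetical, the use of System.nanoTime() is an assumption about how time could be measured, and because 15 - 10 = 5 discarded runs cannot be split evenly, the sketch drops the 2 fastest and the 3 slowest, which is also an assumption.

```java
import java.util.Arrays;

// Hypothetical helper for trimmed-mean timing; not the original harness.
final class TrimmedTiming {

    // Mean of the middle `keep` samples after sorting.
    // If the number of discarded samples is odd, one more is dropped
    // from the slow end than from the fast end (an assumption).
    static double trimmedMean(long[] samples, int keep) {
        if (keep <= 0 || keep > samples.length) {
            throw new IllegalArgumentException("keep must be in 1..samples.length");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int start = (sorted.length - keep) / 2;
        double sum = 0;
        for (int i = start; i < start + keep; i++) {
            sum += sorted[i];
        }
        return sum / keep;
    }

    // Runs the task `runs` times and returns the trimmed mean in nanoseconds.
    static double timeTask(Runnable task, int runs, int keep) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - t0;
        }
        return trimmedMean(samples, keep);
    }
}
```

A call such as TrimmedTiming.timeTask(task, 15, 10) would then give one data point per number of semantic regions, where task stands for one run of either query processing program.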
4.1.2 Results for Join Queries
Over the 10 cases that we tested, a consistent pattern
of differing efficiencies between the Ren et al. and
visualization methods clearly emerged. The
visualization method’s execution time increases
linearly, as we predicted, while the Ren et al.
method’s execution time increases exponentially in
some cases.
Table 1: Test queries for Case 1 through 5.
Q1
Select t1.x, t1.y, t2.x, t2.y from t1, t2 where t1.x>=3&
t1.x<=8& t1.y>=8&t1.y<=12&t2.x>=4&t2.x<=11&t2.y>=4
& t2.y<=10;
Q2
Select t1.x, t1.y, t2.x, t2.y from t1, t2 where t1.id=t2.id&
t1.x>=0&t1.x<=2&t1.y>=0&t1.y<=2&t2.x>=0&t2.x<=2&
t2.y>=0& t2.y<=2;
Q3
Select t1.x, t1.y, t2.x, t2.y from t1, t2 where t1.id>=t2.id&
t1.x>=5&t1.x<=8&t1.y>=6&t1.y<=8&t2.x>=6&t2.x<=7&
t2.y>=5&t2.y<=7;
Q4
Select t1.x, t1.y, t2.x, t2.y from t1, t2 where t1.id>=t2.id&
t1.x>=1&t1.x<=2&t1.y>=6&t1.y<=8&t2.x>=2&t2.x<=15&
t2.y>=2&t2.y<=7;
Q5
Select t1.x, t1.y, t2.x, t2.y from t1, t2 where t1.id=t2.id&
t1.x>=0&t1.x<=10&t1.y>=0&t1.y<=20&t2.x>=5&t2.x<=10
&t2.y>=2&t2.y<=7;
Q6
Select t1.x, t1.y, t2.x, t2.y from t1, t2 where t1.id=t2.id&
1.x>=3& t1.x<=11&t1.y>=9&t1.y<=14&t2.x>=8&t2.x<=13
& t2.y>=9 & t2.y<=13
Q7
Select t1.x, t1.y, t2.x, t2.y from t1, t2 where t1=t2.id&
t1.x>=3&t1.x<=8&t1.y>=8&t1.y<=12&t2.x>=4&t2.x<=11
& t2.y>=4&t2.y<=10;
Q8
Select t1.x,t1.y,t2.x,t2.y from t1,t2 where t1.id=t2.id&
t1.x>=0& t1.x <=2&t1.y>=0&t1.y<=2& t2.x>=0&t2.x<=2&
t2.y>=0&t2.y<=2;
Q9
Select t1.x,t1.y,t2.x,t2.y from t1,t2 where t1.id>=t2.id&
t1.x>=5 &t1.x<=8&t1.y>=6&t1.y<=8&t2.x>=6&t2.x<=7&
t2.y>=5&t2.y<=7;
Case 1: No Intersection
Figure 2 shows the performance of the two methods
for two test queries that represent the case where
no probe query is generated (Q_PQ = Ø). For
both examples (Query 2 and Query 4), the
visualization method is clearly more efficient than
the Ren et al. method as the number of semantic
regions increases. (Note: the visualization curves for
Query 2 and Query 4 overlap completely in
Figure 2.)
Figure 2: No Containment.
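The two base cases can be distinguished with a simple interval test on the range predicates. The sketch below assumes that a user query and a semantic region are each reduced to one closed range per attribute, listed in the same order; this representation, the class and enum names, and the region bounds used in the example are all hypothetical, and the code is not the trimming logic of either method under comparison.

```java
// Hypothetical classifier for the base cases discussed in this section.
final class RangeOverlap {

    enum Case { NO_INTERSECTION, FULL_CONTAINMENT, PARTIAL }

    // Each row is a closed range {lo, hi} for one attribute.
    // query[i] and region[i] must describe the same attribute.
    static Case classify(double[][] query, double[][] region) {
        if (query.length != region.length) {
            throw new IllegalArgumentException("Attribute counts differ");
        }
        boolean contained = true;
        for (int i = 0; i < query.length; i++) {
            double qLo = query[i][0], qHi = query[i][1];
            double rLo = region[i][0], rHi = region[i][1];
            // Disjoint on any one attribute: no probe query can be generated.
            if (qHi < rLo || rHi < qLo) {
                return Case.NO_INTERSECTION;
            }
            // The query sticks out of the region on this attribute.
            if (qLo < rLo || qHi > rHi) {
                contained = false;
            }
        }
        // Contained on every attribute: no remainder query is needed.
        return contained ? Case.FULL_CONTAINMENT : Case.PARTIAL;
    }

    public static void main(String[] args) {
        // t1 ranges of Query 2 in Table 1: x in [0, 2], y in [0, 2].
        double[][] query2 = { {0, 2}, {0, 2} };
        // Hypothetical semantic region, chosen only for illustration.
        double[][] region = { {5, 10}, {5, 10} };
        System.out.println(classify(query2, region)); // prints NO_INTERSECTION
    }
}
```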
Case 2: Full Containment
Figure 3 models performance for our other base
case, where there is no remainder query because the
ICEIS 2011 - 13th International Conference on Enterprise Information Systems