Formalising the phenomenon just observed, we say
that two filters are in conflict if they use the same
facts. Since in (Ohler and Terwelp, 2015) it has been
shown that the runtime costs of the network depicted
in Figure 5 are always higher than those of the other
two networks, we will not consider such networks
here. Which of the two remaining networks performs
better depends on the data to be expected.
Furthermore, there may be situations where node
sharing is not beneficial. For example, two rules shar-
ing a filter that all facts pass should not share that filter
if they have other (more selective) filters that could be
applied to the data first. Sharing the filter would
require applying it first, resulting in a high
maintenance cost for the corresponding node. Applying
the filter last could lead to very low maintenance costs,
as very few facts reach the node, so that even the
twofold costs are lower than the costs in the sharing
situation. Detecting these situations requires information
about, e. g., filter selectivities, but can further
improve the quality of the resulting network.
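The trade-off described above can be illustrated with a small sketch. It assumes a toy cost model that is not taken from the paper: the maintenance cost of a node is simply the number of facts reaching it, and every filter passes on a fixed fraction (its selectivity) of its input. All function names and numbers are hypothetical.

```python
def facts_reaching_nodes(num_facts, selectivities):
    """Total number of facts arriving at a chain of filter nodes.

    Toy cost model (an assumption): a node's maintenance cost equals the
    number of facts that reach it; each filter passes on the given fraction.
    """
    total = 0.0
    remaining = float(num_facts)
    for selectivity in selectivities:
        total += remaining
        remaining *= selectivity
    return total


def cost_shared(num_facts, common_sel, sel_rule1, sel_rule2):
    """The common filter is applied once and first; each rule's own filter follows.

    The rule-specific selectivities do not change how many facts arrive at
    the nodes here; they are kept only for a symmetric signature.
    """
    after_common = num_facts * common_sel
    # One shared node sees all facts; each rule-specific node sees its output.
    return num_facts + 2 * after_common


def cost_unshared(num_facts, common_sel, sel_rule1, sel_rule2):
    """Each rule applies its selective filter first and its own copy of the common filter last."""
    return (facts_reaching_nodes(num_facts, [sel_rule1, common_sel])
            + facts_reaching_nodes(num_facts, [sel_rule2, common_sel]))


# A common filter that all facts pass (selectivity 1.0) and two
# rule-specific filters that each pass 1 % of the facts.
shared = cost_shared(1000, 1.0, 0.01, 0.01)
unshared = cost_unshared(1000, 1.0, 0.01, 0.01)
print(shared, unshared, unshared < shared)
```

Under these assumed numbers the two private copies of the common filter together receive far fewer facts than the single shared node, so the unshared variant comes out cheaper. A realistic cost model would also have to account for joins and memory sizes.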
Finally, integrating the degree of freedom intro-
duced by the equivalence classes as mentioned in Sec-
tion 4 into the network construction is a further aspect
considered here.
6 STATE OF THE ART
There are several DN construction algorithms creat-
ing different types of networks such as Rete (Forgy,
1982), TREAT (Miranker, 1987), and Gator (Hanson
et al., 2002). Yet, they all consider the rules one after
another so that the degree of sharing network parts is
governed mainly by the order in which the rules are
considered and the order of the filters within the rule
conditions. Furthermore, the optimisation potential
introduced by the equivalence classes is neglected and
all variables are assumed to be bound or are bound in
a preliminary consideration.
An approach for query optimisation for in-
memory DRC database systems is presented in
(Whang and Krishnamurthy, 1990). They exploit the
concept of equivalence classes, but only consider left-
deep join plans and look at each query on its own
without evaluating node-sharing.
In (Aouiche et al., 2006), the authors apply a data-
mining technique to decide which views to materi-
alise during the processing of a set of queries in a rela-
tional database system. Here, several queries are con-
sidered together and grouped by a similarity heuristic.
Columns relevant for materialisation are identified by
a cost function and re-used as much as possible to pre-
vent repeated evaluations. In doing so, the filters to be
applied are reduced to the ones relevant to all queries
involved. However, they do not identify the problem
of conflicts as such, and decisions are based on the
columns to be materialised rather than on filters, as
done here.
7 APPROACH
Previously, we referred to different types of facts,
which we will now call templates. A template resem-
bles a class and its fields are called slots. All facts are
instances of templates. More specifically, we will use
the term fact binding to be able to distinguish between
several facts of the same template. Every fact in the
resulting fact tuple of a rule condition corresponds to
a fact binding and vice versa. Equivalence classes
as introduced in Section 4 contain fact bindings, slot
bindings (bindings to a slot of a fact binding), constants,
and functional expressions (e. g. ?x+?y). A filter
comprises a predicate (the test to be executed) and
the parameters to be used. We distinguish between
the following two types of filters:
Explicit Filter. An explicit filter is a filter using
equivalence classes as arguments.
Implicit Filter. An implicit filter tests the equality of
exactly two elements of the corresponding equiv-
alence class.
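The terminology introduced so far can be sketched as plain data structures. This is only an illustration: the class names, the field names, and the choice to enumerate every pair of elements as a candidate implicit filter are assumptions and are not prescribed by the text.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Template:
    name: str
    slots: tuple          # slot names, comparable to the fields of a class


@dataclass(frozen=True)
class FactBinding:
    name: str             # distinguishes several facts of the same template
    template: Template


@dataclass(frozen=True)
class SlotBinding:
    fact: FactBinding
    slot: str


@dataclass(frozen=True)
class EquivalenceClass:
    # May hold fact bindings, slot bindings, constants and functional expressions.
    elements: frozenset


@dataclass(frozen=True)
class ExplicitFilter:
    predicate: str        # the test to be executed
    args: tuple           # equivalence classes used as arguments


def candidate_implicit_filters(eq_class):
    """Every pair of elements is a candidate equality test (an assumption)."""
    ordered = sorted(eq_class.elements, key=repr)   # deterministic order
    return list(combinations(ordered, 2))


person = Template("person", ("name", "age"))
p1, p2 = FactBinding("p1", person), FactBinding("p2", person)
ages = EquivalenceClass(frozenset({SlotBinding(p1, "age"),
                                   SlotBinding(p2, "age"), 18}))
print(len(candidate_implicit_filters(ages)))
```

An equivalence class with n elements yields n(n-1)/2 candidate pairs in this sketch, although n-1 suitably chosen equality tests already suffice to establish that all elements are equal.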
For two filters $f$ and $g$ we call $c(\vartheta(f), \vartheta'(g))$ the
conflict index set w. r. t. the equivalence class restrictions
$\vartheta$ and $\vartheta'$ (see below). It contains pairs of indices,
with the first index corresponding to a parameter position
of the filter $f$ and the second to a parameter position
of $g$. Only those index pairs are contained for which
there is a non-empty intersection of the fact bindings
in the restricted equivalence classes corresponding to
the parameters determined by the indices.
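A minimal sketch of the conflict index set follows. It assumes that each parameter of a filter is already given as the set of fact-binding names contained in its restricted equivalence class, and it uses 0-based parameter positions. Both choices, as well as the function name, are illustrative assumptions.

```python
def conflict_index_set(f_params, g_params):
    """Pairs (i, j) such that parameter i of f and parameter j of g
    share at least one fact binding.

    f_params, g_params: one set of fact-binding names per parameter,
    taken from the restricted equivalence class of that parameter
    (representation assumed for this sketch).
    """
    return {
        (i, j)
        for i, bindings_f in enumerate(f_params)
        for j, bindings_g in enumerate(g_params)
        if bindings_f & bindings_g      # non-empty intersection
    }


# Filter f uses fact bindings p1 and p2; filter g uses {p2, p3} and p4.
f_params = [{"p1"}, {"p2"}]
g_params = [{"p2", "p3"}, {"p4"}]
print(conflict_index_set(f_params, g_params))
```

In this example only the second parameter of f and the first parameter of g share a fact binding (p2), so the result holds exactly that one index pair.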
Two filters $f$ and $g$ are in conflict w. r. t. $\vartheta$ and
$\vartheta'$ iff $c(\vartheta(f), \vartheta'(g)) \neq \emptyset$. Given the filters $a$, $b$, $f$, $g$,
we write $(a, f) \sim^{\vartheta}_{c} (b, g)$ instead of $c(\vartheta(a), \vartheta(f)) =
c(\vartheta(b), \vartheta(g))$. A block consists of the following four
components:
Equivalence Class Restriction. An equivalence
class restriction (denoted ϑ) of a block is a func-
tion mapping every equivalence class occurring in
the block onto the maximal subset still guaranteed
by the implicit tests of the block.
Filter Partition. A filter partition is a partition of the
explicit filters of a block with the following prop-
erty: Every set of the partition contains filters of
only one predicate and for every pair of sets in the
partition it holds that every pair of elements of the
WEBIST 2016 - 12th International Conference on Web Information Systems and Technologies