Rating of Discrimination Networks for Rule-based Systems
Fabian Ohler, Kai Schwarz, Karl-Heinz Krempels and Christoph Terwelp
Informatik 5, Information Systems and Databases, RWTH Aachen University, 52072 Aachen, Germany
Keywords:
Rule-based System, Discrimination Network, Rating.
Abstract:
The amount of information stored in digital form grows on a daily basis but is mostly only understandable
by humans, not machines. A way to enable machines to understand this information is to use a representation
suitable for further processing, e.g. frames for fact declaration in a Rule-based System. Rule-based Systems
heavily rely on Discrimination Networks to store intermediate results and thus speed up the rule processing cycles.
As these Discrimination Networks have a very complex structure, it is important to be able to optimize them or
to choose one out of many Discrimination Networks based on its structural efficiency. Therefore, we present
a rating mechanism for Discrimination Network structures and their efficiency. The ratings are based on
a normalised representation of Discrimination Network structures and on change frequency estimations of the
facts in the working memory, and they are used to compare different Discrimination Networks regarding
processing costs.
1 INTRODUCTION
This paper presents a rating function for Discrimina-
tion Networks (DNs) in Rule-based Systems (RBSs).
It elaborates the need for information about the fact
base of the RBS and introduces a normalised form
for DNs. This normalised form allows for a rating of
the DN disregarding implementation details. Using
these ratings it is possible to evaluate the efficiency of
DNs, measure optimization attempts and improve the
overall performance of the RBS without the need for
benchmarking.
The paper is organized as follows: Section 2 explains why
a new rating approach is worthwhile. Section 3 introduces
existing rating approaches for DNs. The new rating
algorithm is developed in Section 4. Finally, conclusions are
given in Section 5.
2 MOTIVATION
The different existing construction algo-
rithms (e.g. TREAT (Miranker, 1987), Gator
(Hanson and Hasan, 1993)) are suited for different
use cases. Therefore the resulting DNs yield varying
runtime and memory costs while maintaining the
same semantics. In most cases it is desirable to
minimize runtime and memory usage as much as
possible. To achieve this goal a way to identify
the DN which best suits the given environment is
necessary.
As DNs are complex and large, an automated comparison of DNs is required. The idea is to rate a DN's
runtime and memory usage in order to compare different DNs. Benefits include being able to find the ‘best’
DN for a given environment as well as measuring the pay-off of optimization attempts.
The composition of the facts in the working memory contributes largely to a DN's performance regarding
runtime and memory usage. An approach for rating has to take this into account and rate a DN
in the context of the composition of the given working memory. Therefore, a rating only gives information
about the performance of a DN with a working memory that satisfies the statistical information used
to rate. If a different working memory is used with the DN, it is possible that the rating no longer gives an
accurate estimate of the performance.
3 STATE OF THE ART
There already are some approaches to compare DNs,
mostly used to show that a new construction algo-
rithm is an improvement.
Benchmark
TREAT (Miranker, 1987) uses benchmarks like
‘monkeys and bananas’ (Brownston et al., 1985)
and ‘waltz’ (Winston, 1992) for an OPS5
(Forgy, 1981) rule-based system. These bench-
marks test a construction algorithm by letting it
construct a DN and then benchmarking it. This
approach often leads to construction algorithms
being optimized for the existing benchmarks.
But because the performance depends on the
composition of the working set and the possible
compositions are nearly endless, a construction
algorithm can yield very good results in a bench-
mark and still be outperformed when constructing
a DN in an actual application.
Another possibility is to actually benchmark the
possible DNs with the real working memory and
take the best one. While being very accurate these
benchmarks need a lot of time and resources.
Therefore, benchmarks are of limited use when
trying to rate a DN efficiently.
Cost Function
Gator (Hanson and Hasan, 1993) (Hanson, 1993)
uses a thorough cost function for single rules us-
ing statistical information about the composition
of the working memory as well as the selectivity
of filters to predict the runtime and memory us-
age. It even considers the size of facts in memory
and how many memory pages have to be touched
to apply changes to the working memory.
While efficient, this cost function only rates a single rule, not the whole DN. On the one hand, it ignores
the benefits of shared nodes; on the other hand, negated condition elements cannot be rated (Hanson, 1993).
Additionally, the cost function was developed with database implementations in mind, so rating a DN not
implemented on a database can lead to deviating results.
4 APPROACH
The considerations taken in the last section suggest
providing a universal cost function for rating DNs.
Such a cost function is introduced in this section. To
rate the DN, at first its structure is normalised in a
very general way representing optimizations concern-
ing the network’s structure as detailed as possible. For
each normalised component cost functions are given.
The costs of all network components can be used to
rate any DN structure.
4.1 Normalisation
The construction algorithms introduced differ not
only in the resulting network structure but also in in-
ternal mechanisms regarding the RBS. To rate DNs
in a normalised manner the following simplifying as-
sumptions are made:
4.1.1 Alpha Nodes
An alpha node filters for one attribute of a fact only.
Alpha nodes filtering for more than one attribute are
split up and represented as a chain of alpha nodes as
they are semantically equivalent. If an alpha node
is connected to other alpha nodes only and does not
have a negated input (see 4.1.3) the internal memory
is omitted (virtual alpha node). This approach pre-
vents memory overhead by splitting up alpha nodes.
Facts in alpha nodes are stored for optimal perfor-
mance regarding joining and selection.
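To make the splitting step concrete, the following sketch shows how a compound alpha filter could be normalised into a semantically equivalent chain of single-attribute alpha nodes. It is a minimal illustration: the names AlphaNode and split_alpha are ours and not part of the paper, and the rule that only the chain's last node keeps an internal memory is a simplification of the virtual-alpha-node criterion above.

# Minimal sketch of alpha-node normalisation; AlphaNode and split_alpha
# are hypothetical names, not taken from the paper.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AlphaNode:
    test: Callable[[dict], bool]   # filters on exactly one attribute
    has_memory: bool               # False for virtual alpha nodes

def split_alpha(tests_per_attribute: List[Callable[[dict], bool]]) -> List[AlphaNode]:
    """Split a compound filter into a semantically equivalent chain.

    Intermediate nodes are virtual (no memory), which avoids the memory
    overhead of splitting; here only the last node keeps a memory."""
    chain = [AlphaNode(test, has_memory=False) for test in tests_per_attribute]
    if chain:
        chain[-1].has_memory = True
    return chain

# A fact passes the chain iff it passes every single-attribute test.
chain = split_alpha([lambda f: f["color"] == "red", lambda f: f["size"] > 3])
fact = {"color": "red", "size": 5}
print(all(node.test(fact) for node in chain))  # True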
4.1.2 Beta Nodes
For every input a beta node has a list containing the
other nodes connected to its inputs in the order they
are supposed to be joined to keep the intermediate re-
sults as small as possible. Variant I beta nodes include
the negated inputs in their lists; variant II beta nodes
do not (the variants are explained in Section 4.1.3). These lists can be
created based on estimates regarding fact correlation
or, in a simplified way, by ordering the nodes according
to their estimated size.
The storage of the fact tuples in a beta node is op-
timised for the beta nodes connected to its output to
allow for efficient selection on relevant facts or their
slots similar to alpha nodes.
To delete facts or fact tuples in beta nodes the
following common optimisation is used: If a fact is
deleted in one of the nodes connected to an input, the
fact tuples resulting from that fact are deleted and the
fact is propagated to successor nodes as it entered the
node, meaning it is not joined. This optimisation is
not used for nodes connected negatively.
4.1.3 Negated Condition Elements
A lot of RBSs allow elements of a rule’s condition to
be negated (negated condition elements). There are
several ways to represent these in DNs. In the ob-
vious variant a beta node combines a positive set of
facts with a negated set of facts by propagating the
set of facts in the positive set that have no counter-
part (regarding the join) in the negated set. Both in-
put nodes continue to be usual (positive) nodes, but
one of the edges is marked as negated. Because no
joined fact tuples are produced and passed on, such
nodes can be implemented in the alpha network too:
Two alpha nodes are connected by a negated edge.
The ‘one input rule’ in the alpha network is diluted:
alpha nodes have exactly one positive input and an arbitrary
number of negated inputs. Negated condition
elements can thus be realized in both the alpha and the beta
network.
If a negated edge connects the output of x with
an input of y, we call x a negatively connected node
(NCN) of y.
The possible implementations of negations are
now discussed briefly. As above we start in the beta
network.
Variant I in the Beta Network
The already mentioned method of implementation
in a beta node shall be explained in more detail. A
beta node with positively connected nodes (PCNs)
and NCNs joins with the PCNs and filters using
the NCNs. If a new fact tuple reaches the node
from a PCN it is joined with the other PCNs. The
result is stored in the node’s memory only if no
matching fact tuples can be found in any of the
NCNs. Deleting a fact tuple in a PCN deletes the
fact tuples resulting from that fact. A new fact tuple
reaching the node on an NCN is joined with
the current result set and matching fact tuples are
deleted. If a fact tuple is deleted in an NCN, matching
fact tuples are searched for in the PCNs or
their joins. Fact tuples found are filtered by the NCNs
and added to the result set.
Variant II in the Beta Network
The second method tries to reduce the compara-
tively high effort of the joins needed in a variant I
beta node if a fact tuple is deleted in a NCN. For
that purpose all join results of the PCNs are saved
instead of only saving a filtered set. To every fact
tuple a counting field is added for every NCN.
The counting field holds the number of matching
fact tuples in the corresponding NCN. Only the
fact tuples with zeros in their counting fields are
relevant to successor nodes. If a new fact tuple
reaches the node from a PCN, it is joined with the
other PCNs and added to the result set. The count-
ing fields are filled with the sizes of the particular
joins with every NCN. Deleting a fact tuple in
a PCN deletes the fact tuples resulting from that
fact. A new fact tuple reaching the node from an
NCN is joined with the result set and the counting
fields of the matching fact tuples are increased by
one. A counting field rising from zero
triggers a propagation of the corresponding fact
tuple to successor nodes as deleted. If a fact tuple
is deleted in an NCN, it is joined with the result
set and the counting fields of the matching
fact tuples are decreased by one. A counting field
dropping to zero has the corresponding fact tuple
propagated as new to successor nodes. The
rule-based system Jamocha (Jamocha, 2006) uses
nodes of this kind.
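The counting-field bookkeeping of variant II can be summarised in a few lines. The following sketch is a simplified model, not Jamocha's implementation; all names (VariantIITuple, ncn_insert, ncn_delete) are hypothetical, and join matching is abstracted into a caller-supplied predicate.

# Simplified model of variant II counting fields; names are hypothetical.
class VariantIITuple:
    def __init__(self, data, num_ncns):
        self.data = data
        self.counts = [0] * num_ncns  # one counting field per NCN

    def is_live(self):
        # Only tuples with zeros in all counting fields are relevant
        # to successor nodes.
        return all(c == 0 for c in self.counts)

def ncn_insert(tuples, ncn_index, matches):
    """A new fact arrives on NCN `ncn_index`; `matches` is a predicate."""
    deleted = []
    for t in tuples:
        if matches(t):
            if t.counts[ncn_index] == 0:
                deleted.append(t)      # propagate as deleted
            t.counts[ncn_index] += 1
    return deleted

def ncn_delete(tuples, ncn_index, matches):
    """A fact is deleted on NCN `ncn_index`."""
    revived = []
    for t in tuples:
        if matches(t):
            t.counts[ncn_index] -= 1
            if t.counts[ncn_index] == 0:
                revived.append(t)      # propagate as new
    return revived

tuples = [VariantIITuple(("a", "b"), num_ncns=1)]
ncn_insert(tuples, 0, matches=lambda t: True)   # tuple is now culled
print([t.is_live() for t in tuples])            # [False]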
Negation in the Alpha Network
As mentioned above, negation can also be imple-
mented in the alpha network. The implementation
is similar to the second variant in the beta network
adding counting fields for the NCNs. If a new fact
reaches the node, it is joined with the NCNs and
the join sizes are saved in the counting fields. Only if
no matching facts are found is the fact passed on
to successor nodes. Deleting a fact in the node
propagates the fact to successor nodes only if all
counting fields are zero. A new fact reaching an
NCN is joined with the node and the counting
fields of the facts in the join result are increased
by one. A counting field rising from zero
triggers a propagation of the corresponding
fact to successor nodes as deleted. If
a fact in an NCN is deleted, it is joined with the
node and the counting fields of the facts in
the join result are decreased by one. A counting
field dropping to zero has the corresponding
fact propagated as new to successor nodes.
Gator (Hanson and Hasan, 1993; Hanson, 1993)
uses negation in this way.
The methods mentioned above are certainly not
the only ways to implement negation, but one possi-
bility to negate in the alpha network and two funda-
mentally different variants to negate in the beta net-
work with dissimilar pros and cons have been pre-
sented. It has been clarified that a negated condition
element does not change the nodes themselves, but
how facts from the corresponding nodes are handled.
The possibility to negate in the alpha network already
offers potential for network optimization.
On the other hand the task to estimate the costs
for a negation in a network has become more diffi-
cult than just estimating the costs of nodes concerning
joins. Here the costs arising from a negation are to be
added to the costs of the adjacent nodes. However, to
estimate the costs properly, both mechanisms
have to be rateable. If it is not known how a network
implements negations, it is rated as if it used variant I.
According to the considerations above, the constraints
for rateable networks can be summarised as
follows: each rule condition has at least one non-negated
condition element, and different negated condition
elements do not share bound variables.
Alpha nodes with negated edges always have an internal
memory.
4.1.4 Processing Tokens
Upon creation of a new fact f (by a rule, for example)
it is encapsulated into a + token. This token is injected
into the root node to be processed by the DN. Analogously,
a deleted fact is injected using a - token. The following
paragraphs explain the steps necessary to process
these facts when a token reaches a node. In the following,
$\bowtie$ always carries the semantics of the corresponding
beta node.
+ token reaches alpha node. If the fact f
does not pass the test set, it is discarded. Otherwise the
node stores the fact inside its memory (if applicable)
and passes the + token on to the following nodes.
f is joined with the memory of every alpha NCN
and the sizes of the joins are stored in the matching
counting fields. Only if all fields contain zeros,
i.e. there are no facts matching f in the alpha NCNs,
is the + token passed on to the following nodes.
For follow-up beta nodes the fact is encapsulated
into a +Temporary Result (TR) token first.
As an NCN, the node joins f with the memories
of the connected nodes and updates their counting
fields: the corresponding counting fields are incremented
by one if the join yields a result. If the
increment changes the value from zero, a - token
or -TR token for f has to be created and passed
on from the connected node.
- token reaches alpha node. If the fact f
does not pass the test set, it is discarded. Otherwise the
node deletes the fact from its memory (if applicable)
and passes the - token on to the following
nodes. The - token is only passed on to successor
nodes if all (possible) counting fields are zero.
For follow-up beta nodes the fact is first encapsulated
into a -TR token.
As an NCN, the node joins f with the memories
of the connected nodes and updates their counting
fields: the corresponding counting fields are
decremented by one if the join yields a result.
If the decrement changes the value to zero, a
+ token or +TR token for f has to be created and
passed on from the connected node.
+TR token reaches beta node. When a
+TR token reaches a node from an NCN, one has to
distinguish: variant I beta nodes join the fact with
their result sets, delete the resulting tuples from
their result sets and propagate this to the follow-up
nodes. Variant II beta nodes join the fact with
their result sets and increment the counting fields
of the resulting tuples by one. If a counting field
is incremented from zero, a -TR token is passed
on to the following nodes.
When a +TR token reaches a node from a PCN,
one has to distinguish again: variant I beta nodes
(or beta nodes without NCNs) have sorted lists
of all their inputs excluding the input the +TR
token arrived from. The node then joins the token
with the inputs iterating the list ($TR \bowtie input$) or
subtracts the join for NCNs ($TR - TR \ltimes input$). In
each iteration the result is stored in TR. The final
result is stored in the internal memory and passed
on to the following nodes.
Variant II beta nodes have lists of their PCNs
excluding the input the TR token arrived from.
The node then joins the token with the inputs
iterating the list ($TR \bowtie input$). In each iteration the
result is stored in TR and the final result is stored
in the internal memory. Additionally, the size of
the join of each result with each NCN is stored
in the appropriate counting field. Only results
whose counting fields are all zero are passed on to
the follow-up nodes.
-TR token reaches beta node. A -TR token
reaching the node from an NCN is handled
similarly to a +TR token from a PCN. Additionally,
one has to distinguish: variant I beta nodes have
sorted lists of all inputs excluding the input the -TR
token arrived from. The node then joins the token
with the inputs iterating the list ($TR \bowtie input$) or
subtracts the join for NCNs ($TR - TR \ltimes input$). TR
is then filtered by the input the token originated
from ($TR - TR \ltimes input$). The result is then stored in
the internal memory and passed on to following
nodes.
4.2 Rating
The DNs shall be rated using a cost analysis. In this
context costs are considered to be memory usage and
runtime. The normalisation allows a uniform rating.
A detailed rating is done based on statistical values
about filters, joins and relations. If the facts are stored
in a relational database, a lot of the values used are
available for internal use already.
The better the statistical values match the real values,
the more precisely the DN can be rated. But even
without these values, statements about DNs can be made.
DNs can be compared, e.g., using general mean values
for the missing values or by rating all conceivable
scenarios.
For the sake of brevity, nodes are tagged corresponding
to Table 1.
In the following it is explained which statistical values are
used to rate a DN and how the different node types
can be rated.
4.2.1 Statistical Values the Rating is Based On

The naming partially follows Gator (Hanson and Hasan, 1993; Hanson, 1993).

Table 1: Node tags.

$x_\alpha$ : an alpha node
$\overline{x}_\alpha$ : an alpha node in a negated context
$x_\beta$ : a beta node
$\overline{x}_\beta$ : a beta node in a negated context
$x^+$ : an alpha or beta node in a non-negated context
$x^-$ : an alpha or beta node in a negated context
$|U(x_\alpha)|$. $U(x_\alpha)$ is the set of facts contained in node $x_\alpha$, including the ones filtered out by NCNs. $|U(x_\alpha)|$ is the size of this set and is a statistical input value for alpha nodes. $X$ is the set of facts belonging to node $x$ as it reaches successor nodes, thus already filtered. Estimates for $|X|$ will be given.

$\mathrm{JSF}(x,y)$. Estimated size of the join in relation to the joined nodes' fact set sizes; this value describes the selectivity of the join (Join Selectivity Factor). It is an expected value for:
$$\frac{|X \bowtie Y|}{|X| \cdot |Y|} \,. \tag{1}$$
$\mathrm{Sel}(x_\alpha)$. The selectivity of a node is the ratio between accepted and rejected facts.

$F_i(x_\alpha)$. The frequency of + tokens reaching and changing the node $x_\alpha$. $\vec{F}_i(x_\alpha)$ is the frequency of + tokens reaching the node $x_\alpha$ which lead to a propagation to follow-up nodes; this value is calculated as necessary.

$F_d(x_\alpha)$. The frequency of - tokens reaching and changing the node $x_\alpha$. $\vec{F}_d(x_\alpha)$ is the frequency of - tokens reaching the node $x_\alpha$ which lead to a propagation to follow-up nodes; this value is calculated as necessary.

$T(x)$. Tuple Size (T) in node $x$. For the sake of simplicity, facts in alpha nodes are treated as being of the same size ($T(x_\alpha) = 1$). A way to calculate this value is given for beta nodes.

$\mathrm{TPP}(x)$. The number of Tuples Per Page (TPP) for facts in node $x$. Caused by the simplification of fact sizes, the TPP is inversely proportional to the tuple size $T$.
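As a hedged illustration of how these input values could be obtained from a sample of the working memory: the sketch below estimates JSF by plain counting against equation (1) and reads Sel as the accepted share of incoming facts, the reading used later in (22) and (23). All function names are ours.

# Estimating JSF and Sel from sample data; a sketch, not the paper's code.
def estimate_jsf(X, Y, join_predicate):
    """Expected value of |X join Y| / (|X| * |Y|), cf. equation (1)."""
    if not X or not Y:
        return 0.0
    matches = sum(1 for x in X for y in Y if join_predicate(x, y))
    return matches / (len(X) * len(Y))

def estimate_sel(incoming_facts, test):
    """Selectivity of an alpha node, estimated from a fact sample."""
    if not incoming_facts:
        return 0.0
    return sum(1 for f in incoming_facts if test(f)) / len(incoming_facts)

people = [{"id": i} for i in range(10)]
orders = [{"person": i % 5} for i in range(20)]
print(estimate_jsf(people, orders, lambda p, o: p["id"] == o["person"]))  # 0.1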
Remark Concerning the Join Selectivity Factor.
Let $x$ and $y$ be nodes with non-empty internal memory, thus $|X| > 0$ and $|Y| > 0$. Adding or removing a fact from $X$ yields the node $x'$ with the result set $X'$. The size of the join $X \bowtie Y$ is expected to change by some $\mu \in \mathbb{R}^+$. An estimated value for this can be calculated using $\mathrm{JSF}(x,y)$:
$$\mathrm{JSF}(x,y) = \frac{|X' \bowtie Y|}{|X'| \cdot |Y|} \tag{2a}$$
$$= \frac{|X \bowtie Y| \pm \mu}{(|X| \pm 1) \cdot |Y|} \tag{2b}$$
$$= \frac{\mathrm{JSF}(x,y) \pm \frac{\mu}{|X| \cdot |Y|}}{1 \pm \frac{1}{|X|}} \tag{2c}$$
$$\Rightarrow \frac{\mathrm{JSF}(x,y)}{1 \pm \frac{1}{|X|}} \left( 1 \pm \frac{1}{|X|} - 1 \right) = \frac{\pm\mu}{(|X| \pm 1) \, |Y|} \tag{2d}$$
$$\Rightarrow \frac{\mathrm{JSF}(x,y)}{|X| \pm 1} = \frac{\mu}{(|X| \pm 1) \, |Y|} \tag{2e}$$
$$\Rightarrow \mathrm{JSF}(x,y) \cdot |Y| = \mu \tag{2f}$$
Until now, the Join Selectivity Factor (JSF) has been
considered for node pairs only. For a precise size
estimation of joins with several inputs JSFs influ-
enced by conditional probabilities of intermediate re-
sults and further inputs would be necessary. To keep
the amount of input values small, the calculations are
simplified in the following way: The JSF for two in-
puts is a good mean for the JSF between these two
inputs as well as for the JSF between each of these
inputs and all intermediate results arising during the
processing of the join list in the node.
Preliminary Considerations about Join Costs.
At this point the problem of join cost estimation
is discussed. Without considering implementation
details it is hard to determine the right criteria for
the estimation. Thus, primarily the costs of a join in
a network using an optimal implementation are to
be examined. As mentioned in the description of the
normalised form, nodes are optimised for successor nodes.
Hence, for the costs of the join $L \bowtie R$, mainly the size
of the result, $S \approx |L| \cdot \mathrm{JSF}(l,r) \cdot |R|$, is to be considered.
The size of this set is a main characteristic of
the join and cannot be changed by optimisations in
the join implementation. To factor the size of facts in
this set into the resulting costs, accesses to memory
pages are chosen as the basis for the rating.
This approach can also be found in Gator
(Hanson, 1993). There, an algorithm by
Cárdenas (Cárdenas, 1975) is used to calculate
the number of memory pages touched for datasets
(uniformly) distributed across $m$ memory pages when
selecting $k$ datasets, as follows:
$$C(m, k) = m \left( 1 - \left( 1 - \frac{1}{m} \right)^{k} \right) \tag{3}$$
A derivation can be found in Yao (Yao, 1977), where
the formula is considered to be faulty as it reflects a
combination with repetition instead of a combination
without repetition; but Yao's corrected version only
allows integer numbers of touched memory pages.
The error described is insignificant for large TPP values
and is accepted here to allow for non-integer
estimated numbers of touched memory pages without
interpolation. According to Yao, alternative formulas
are by far more complicated, but they can be used to
replace the Cárdenas formula as needed to determine
more accurate results.
The number of resulting datasets is $k \approx S$.
Most nodes are connected to multiple successor
nodes and their facts are usually joined with different
slots, so datasets can not be stored optimized for
selection of a particular slot. Thus to assume a
uniform distribution of the datasets will presumably
cause only a minor deviation for most of the nodes.
Nodes that store their data sets optimized in this
way cause less runtime costs and can be considered
separately. In the interest of clarity this is not done
here.
Using the simplified knowledge about the number
of facts / fact tuples of a node $r$ fitting on a memory
page, the number of memory pages needed for $r$ can
be determined:
$$m(r) \approx \left\lceil |R| / \mathrm{TPP}(r) \right\rceil \tag{4}$$
$$m(U(r)) \approx C\left( \left\lceil \frac{|U(R)|}{\mathrm{TPP}(r)} \right\rceil, m(r) \right) \tag{5}$$
The costs to join a set of $p$ facts / fact tuples from node
$l$ on node $r$ are given for subsequent calculations
by the following formula:
$$\mathrm{JC}_{l,r}(p) \approx C\bigl( m(r), \, p \cdot \mathrm{JSF}(l,r) \cdot |R| \bigr) \tag{6}$$
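These formulas translate directly into code. The following sketch implements (3), (4) and (6); the non-integer page counts produced by the Cárdenas formula are kept deliberately, as discussed above. Function and parameter names are ours.

import math

def cardenas(m, k):
    """C(m, k): expected number of memory pages touched when selecting
    k datasets uniformly distributed across m pages, equation (3)."""
    if m <= 0:
        return 0.0
    return m * (1.0 - (1.0 - 1.0 / m) ** k)

def pages(size, tpp):
    """m(r): number of memory pages needed for a node, equation (4)."""
    return math.ceil(size / tpp)

def join_cost(size_r, tpp_r, jsf, p):
    """JC_{l,r}(p) ~ C(m(r), p * JSF(l,r) * |R|), equation (6)."""
    m_r = pages(size_r, tpp_r)
    return cardenas(m_r, p * jsf * size_r)

# Joining one fact (p = 1) against a node with 10,000 facts,
# 50 tuples per page and a JSF of 0.001:
print(join_cost(10_000, 50, 0.001, 1))  # about 9.8 pages touched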
4.2.2 Root Node
This node is necessary and therefore present in every
network. Its outer structure is unique. Thus it is needless
to factor this node into the rating.
However, statistical values about the set of facts in the
root node, the number of facts per memory page and
the frequency of +/- tokens reaching the root node and
changing it are important to derive values for successor
nodes.
4.2.3 Alpha Node
An alpha node causes memory and runtime costs in
the network. Memory costs are involved only if the
node has an internal memory.
Memory. To determine the memory costs of alpha
node $x_\alpha$ with internal memory, let $N(x_\alpha)$ be the set of
alpha NCNs of $x_\alpha$. Neglecting the counting fields, the
memory costs of $x_\alpha$ are given by $|U(x_\alpha)|$. Depending
on the implementation and the size of the facts, the memory
cost impact of counting fields varies. For now, a
counting field shall increase the size of a fact by 15%.

Result. The memory costs for an alpha node result in
$$|U(x_\alpha)| \cdot \bigl( 1 + 0.15 \cdot |N(x_\alpha)| \bigr) \tag{7}$$
memory units.
Runtime. Below, an attempt is made to give an estimation
of the runtime costs of $x_\alpha$ regarding new facts:
$$F_i(x_\alpha) \Biggl( \mathrm{InsC}_{x_\alpha} + \sum_{\overline{y}_\alpha \in N(x_\alpha)} \mathrm{JoinC}_{x_\alpha}(\overline{y}_\alpha) + \sum_{y_\alpha \in \overline{N}(x_\alpha)} \overline{\mathrm{JoinC}}_{x_\alpha}(y_\alpha) \Biggr) \tag{8}$$
$\mathrm{InsC}_{x_\alpha}$ are the costs to filter and insert a fact into the
internal memory. These are 1 for applying the filter
if $x_\alpha$ does not have an internal memory. Otherwise
the costs are increased by 1 for storing the fact in the
node appropriately, resulting in 2 runtime units.
If there are alpha NCNs for $x_\alpha$, upon inserting a fact
into $x_\alpha$ it is joined with their facts. Let $N(x_\alpha)$ be the
set of alpha NCNs of $x_\alpha$. $\mathrm{JoinC}_{x_\alpha}(\overline{y}_\alpha)$ are the join
costs for the new fact in $x_\alpha$ with the NCN $\overline{y}_\alpha$. These
are approximated by $\mathrm{JC}_{x_\alpha, U(\overline{y}_\alpha)}(1)$.
As an NCN for other alpha nodes, $\overline{x}_\alpha$ joins any inserted
facts with the memories of its connected nodes.
Let $\overline{N}(x_\alpha)$ be the set of nodes $\overline{x}_\alpha$ is connected to.
$\overline{\mathrm{JoinC}}_{x_\alpha}(y_\alpha)$ are the join costs for the new fact in $\overline{x}_\alpha$
with the connected node $y_\alpha$. These are approximated
by $\mathrm{JC}_{\overline{x}_\alpha, U(y_\alpha)}(1)$.
In the same manner, an estimate for the costs resulting
from deleting a fact is given:
$$F_d(x_\alpha) \Biggl( \mathrm{DelC}_{x_\alpha} + \mathrm{CheckCounters} + \sum_{y_\alpha \in \overline{N}(x_\alpha)} \overline{\mathrm{JoinC}}_{x_\alpha}(y_\alpha) \Biggr) \tag{9}$$
Here $\mathrm{DelC}_{x_\alpha}$ are the costs to filter and delete the fact
from the internal memory. These are 1 for applying
the filter if $x_\alpha$ does not have an internal memory. Otherwise
the costs are increased by 1 for deleting the
fact, resulting in 2 runtime units.
If there are alpha NCNs for $x_\alpha$, it has to check the
fact's counting fields before propagating it in the network.
For this action, 1 runtime unit is assessed.
As an NCN for other alpha nodes, $\overline{x}_\alpha$ joins any
deleted facts with the memories of its connected
nodes. Let $\overline{N}(x_\alpha)$ be the set of nodes $\overline{x}_\alpha$ is connected to.
$\overline{\mathrm{JoinC}}_{x_\alpha}(y_\alpha)$ are the join costs for the
fact to be deleted in $\overline{x}_\alpha$ with the connected node $y_\alpha$.
These are approximated by $\mathrm{JC}_{\overline{x}_\alpha, U(y_\alpha)}(1)$.
Result. The runtime costs for an alpha node $x_\alpha$
without internal memory are
$$F_i(x_\alpha) + F_d(x_\alpha) \tag{10}$$
runtime units; those for an alpha node $x_\alpha$ with internal
memory are
$$F_i(x_\alpha) \Biggl( 2 + \sum_{y_\alpha \in \overline{N}(x_\alpha)} \mathrm{JC}_{\overline{x}_\alpha, U(y_\alpha)}(1) \Biggr) + F_d(x_\alpha) \Biggl( 2 + \sum_{y_\alpha \in \overline{N}(x_\alpha)} \mathrm{JC}_{\overline{x}_\alpha, U(y_\alpha)}(1) \Biggr) \tag{11}$$
runtime units if $x_\alpha$ does not have any NCNs, and
$$F_i(x_\alpha) \Biggl( 2 + \sum_{\overline{y}_\alpha \in N(x_\alpha)} \mathrm{JC}_{x_\alpha, U(\overline{y}_\alpha)}(1) + \sum_{y_\alpha \in \overline{N}(x_\alpha)} \mathrm{JC}_{\overline{x}_\alpha, U(y_\alpha)}(1) \Biggr) + F_d(x_\alpha) \Biggl( 3 + \sum_{y_\alpha \in \overline{N}(x_\alpha)} \mathrm{JC}_{\overline{x}_\alpha, U(y_\alpha)}(1) \Biggr) \tag{12}$$
runtime units if $x_\alpha$ has NCNs.
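To make the result concrete, the sketch below evaluates (11) and (12) under our reading of the formulas, reusing the join cost estimate JC from equation (6). Representing the statistics of a connected node as a (size, TPP, JSF) triple is our simplification.

import math

def cardenas(m, k):
    # Equation (3); expected pages touched.
    return m * (1.0 - (1.0 - 1.0 / m) ** k) if m > 0 else 0.0

def jc(size_r, tpp_r, jsf, p=1):
    # Equation (6): JC against a node described by (size, TPP, JSF).
    return cardenas(math.ceil(size_r / tpp_r), p * jsf * size_r)

def alpha_runtime(fi, fd, ncns, negatively_connected):
    """Equations (11)/(12): ncns are the alpha NCNs of x, and
    negatively_connected are the nodes x is an NCN of; each entry
    is a (|U(y)|, TPP(y), JSF) triple."""
    join_ncn = sum(jc(*y) for y in ncns)                  # joins with NCN memories
    join_as_ncn = sum(jc(*y) for y in negatively_connected)
    if not ncns:                                          # equation (11)
        return fi * (2 + join_as_ncn) + fd * (2 + join_as_ncn)
    return fi * (2 + join_ncn + join_as_ncn) + fd * (3 + join_as_ncn)  # (12)

# A node with one NCN (1,000 facts, 50 per page, JSF 0.01),
# 5 insertions and 2 deletions per time unit:
print(alpha_runtime(5, 2, [(1_000, 50, 0.01)], []))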
Further Considerations. The set of facts propagated
by $x_\alpha$ is filtered by its NCNs and can be described
by $U(x_\alpha) \setminus \bigcup_{\overline{y}_\alpha \in N(x_\alpha)} U(x_\alpha) \ltimes \overline{y}_\alpha$. Only those
facts are propagated that appear in none of the joins;
but as facts can easily appear in several joins, the set
of facts propagated can not just be estimated (similar
to the reverse triangle inequality) by
$$|U(x_\alpha)| - \sum_{\overline{y}_\alpha \in N(x_\alpha)} \bigl| U(x_\alpha) \ltimes \overline{y}_\alpha \bigr| \;\le\; \Bigl| U(x_\alpha) \setminus \bigcup_{\overline{y}_\alpha \in N(x_\alpha)} U(x_\alpha) \ltimes \overline{y}_\alpha \Bigr| \,. \tag{13}$$
The following thoughts elaborate how the number of
propagated facts can be estimated. The sum of the
counting fields of $U(x_\alpha)$ regarding $\overline{y}_\alpha$ can be expected
to be $|U(x_\alpha) \bowtie \overline{y}_\alpha|$. The estimate for a counting
field of a fact in $U(x_\alpha)$ is therefore
$$\frac{|U(x_\alpha) \bowtie \overline{y}_\alpha|}{|U(x_\alpha)|} = |\overline{y}_\alpha| \cdot \mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha) \,. \tag{14}$$
The expected probability that a fact in $U(x_\alpha)$ matches
a fact in $\overline{y}_\alpha$ is therefore $\mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha)$. The probability
that a fact in $U(x_\alpha)$ has no matching fact in
$\overline{y}_\alpha$ can be calculated with
$$\bigl( 1 - \mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha) \bigr)^{|\overline{y}_\alpha|} \,. \tag{15}$$
An estimate for the number of facts in $U(x_\alpha)$ which
have zero in all counting fields is given by
$$|x_\alpha| = |U(x_\alpha)| \cdot \prod_{\overline{y}_\alpha \in N(x_\alpha)} \bigl( 1 - \mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha) \bigr)^{|\overline{y}_\alpha|} \,. \tag{16}$$
Hereby an estimate for the number of propagated facts
has been found. Changes in alpha NCNs can lead
to changes in the counting fields and therefore to +/-
(TR) tokens being propagated.
An estimate for the number of (TR) tokens is valuable
for further considerations. Let $\overline{z}_\alpha$ be an alpha
NCN. Further, let $N(x_\alpha)$ be the set of NCNs of $x_\alpha$, so
$\overline{z}_\alpha \in N(x_\alpha)$. Referencing (16), the number of facts in
$U(x_\alpha)$ with zeros in all counting fields when a fact is
added to $\overline{z}_\alpha$ can be estimated with:
$$\bigl( 1 - \mathrm{JSF}(U(x_\alpha), \overline{z}_\alpha) \bigr) \cdot |U(x_\alpha)| \cdot \prod_{\overline{y}_\alpha \in N(x_\alpha)} \bigl( 1 - \mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha) \bigr)^{|\overline{y}_\alpha|} \tag{17}$$
The expectancy for the number of generated - (TR)
tokens after inserting a fact in $\overline{z}_\alpha$ can be described
as the difference between (16) and (17):
$$\mathrm{JSF}(U(x_\alpha), \overline{z}_\alpha) \cdot |x_\alpha| \tag{18}$$
The expectancy for the number of generated + (TR)
tokens after deleting a fact in $\overline{z}_\alpha$ can be calculated
analogously:
$$\frac{\mathrm{JSF}(U(x_\alpha), \overline{z}_\alpha)}{1 - \mathrm{JSF}(U(x_\alpha), \overline{z}_\alpha)} \cdot |x_\alpha| \tag{19}$$
With these thoughts, an expectancy for the number of
generated +/- (TR) tokens can be given:
$$\vec{F}_i(x_\alpha) = F_i(x_\alpha) \, \frac{|x_\alpha|}{|U(x_\alpha)|} + \sum_{\overline{y}_\alpha \in N(x_\alpha)} F_d(\overline{y}_\alpha) \, |x_\alpha| \, \frac{\mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha)}{1 - \mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha)} \tag{20}$$
$$\vec{F}_d(x_\alpha) = F_d(x_\alpha) \, \frac{|x_\alpha|}{|U(x_\alpha)|} + \sum_{\overline{y}_\alpha \in N(x_\alpha)} F_i(\overline{y}_\alpha) \, |x_\alpha| \, \mathrm{JSF}(U(x_\alpha), \overline{y}_\alpha) \tag{21}$$
Now one can estimate the frequency of +/- tokens that
reach and change $x_\alpha$ from the PCN $w_\alpha$ as follows:
$$F_i(x_\alpha) = \mathrm{Sel}(x_\alpha) \, F_i(w_\alpha) \tag{22a}$$
$$F_d(x_\alpha) = \mathrm{Sel}(x_\alpha) \, F_d(w_\alpha) \tag{22b}$$
Lacking an expectancy for the number of facts in a node, one can
estimate it analogously:
$$|U(x_\alpha)| = \mathrm{Sel}(x_\alpha) \, |w_\alpha| \tag{23}$$
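A short numeric sketch of (16), (22) and (23): given the selectivity of a node, the statistics of its PCN w and the sizes and JSFs of its NCNs, the expected number of propagated facts follows directly. All names are ours.

def propagated_facts(u_size, ncns):
    """Equation (16): |x| = |U(x)| * product of (1 - JSF)^|y| over all
    NCNs; ncns is a list of (size, jsf) pairs."""
    estimate = u_size
    for ncn_size, jsf in ncns:
        estimate *= (1.0 - jsf) ** ncn_size
    return estimate

# Equations (22)/(23): derive the node's statistics from its PCN w.
sel = 0.2                                   # selectivity of the alpha node
fi_w, fd_w, w_size = 10.0, 4.0, 5_000
fi_x, fd_x = sel * fi_w, sel * fd_w         # (22a), (22b)
u_x = sel * w_size                          # (23)

# One NCN with 100 facts and a JSF of 0.005:
print(propagated_facts(u_x, [(100, 0.005)]))  # 1000 * 0.995^100, about 606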
4.2.4 Beta Node
A beta node always causes memory and runtime
costs. Let $E(x_\beta)$ be the (multi-)set of the nodes connected
to the inputs of $x_\beta$. Further, let $E(x_\beta) = P(x_\beta) \cup N(x_\beta)$,
with $N(x_\beta)$ being the set of NCNs and $P(x_\beta)$ the
set of PCNs.
Preliminary. When a new fact reaches a beta node,
it has to be joined with the facts of all other PCNs.
Below, an estimate for the size of this join will be developed.
The expectancy for the input $y$ of node $x_\beta$
is $\mathrm{JoinSize}_{x_\beta}(y)$.
For nodes lacking an edge in the join graph, the JSF
values are always one. Let $x_\beta$ have $n$ inputs. For the
input $y$ it has a sorted join list of all inputs excluding
$y$. This join list consecutively numbers the inputs $e_i$
of $x_\beta$ beginning with index 2; $e_1$ always is $y$. The JSF
values are marked with the indices of the corresponding
inputs for the sake of clarity:
$$\mathrm{JoinSize}_{x_\beta}(y^+) = \prod_{k=2}^{n} |e_k| \cdot \mathrm{JSF}_{y^+}(e_{k-1}, e_k) \,, \quad \text{with } e_1 = y^+ \tag{24}$$
Memory. $|U(x_\beta)|$ is to be calculated. The mean of
all join sizes will give an estimate:
$$|U(x_\beta)| = \frac{1}{|P(x_\beta)|} \sum_{y^+ \in P(x_\beta)} |y^+| \cdot \mathrm{JoinSize}_{x_\beta}(y^+) \tag{25}$$
This estimate is further filtered by NCNs, see (16):
$$|x_\beta| = |U(x_\beta)| \cdot \prod_{\overline{y} \in N(x_\beta)} \bigl( 1 - \mathrm{JSF}(U(x_\beta), \overline{y}) \bigr)^{|\overline{y}|} \tag{26}$$
The size of a fact tuple equals the sum of the tuples
joined. Given the size, the number of tuples per
memory page can be calculated:
$$T(x_\beta) = \sum_{y^+ \in P(x_\beta)} T(y^+) \tag{27a}$$
$$\mathrm{TPP}(x_\beta) = \mathrm{TPP}(\mathrm{root}) / T(x_\beta) \tag{27b}$$
Result. The memory costs of a beta node without
NCNs and of a variant I beta node (see Section 4.1.3) are given by
$$|x_\beta| \cdot T(x_\beta) \,. \tag{28}$$
In a variant II beta node, the counting fields further
increase the memory costs:
$$|U(x_\beta)| \cdot \bigl( T(x_\beta) + 0.15 \cdot |N(x_\beta)| \bigr) \tag{29}$$
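The join size estimate (24) and the memory costs (25) to (29) can be sketched as follows; representing the sorted join list as ordered (size, JSF) pairs is our simplification.

def join_size(join_list):
    """Equation (24): product of |e_k| * JSF(e_{k-1}, e_k) over the
    sorted join list of an input; join_list holds (|e_k|, JSF) pairs
    for k >= 2."""
    size = 1.0
    for ek_size, jsf in join_list:
        size *= ek_size * jsf
    return size

def beta_memory_variant2(u_size, tuple_size, num_ncns):
    """Equation (29): each counting field adds 0.15 per tuple."""
    return u_size * (tuple_size + 0.15 * num_ncns)

# Input y with 100 facts joined against two further PCNs:
jl = [(1_000, 0.001), (500, 0.002)]
print(100 * join_size(jl))                # |U(x)| contribution of y, cf. (25)
print(beta_memory_variant2(100.0, 3, 2))  # 100 * (3 + 0.3) = 330.0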
Runtime. The frequency of +/- (TR) tokens being
passed on from beta nodes is needed to estimate the
runtime costs. The frequency can be estimated similarly
to the frequency in the alpha network. As no estimate
about the bundling of facts in TR tokens can be
made without a distribution function, the assumption
is made that a TR token only encapsulates one fact
tuple. Given this assumption, the frequency can simply
be multiplied with the expectancy for the number
of generated fact tuples.
For -TR tokens reaching a beta node from a PCN, the
optimization described in Section 4.1.2 is used. Therefore
only the fact to be deleted is considered and not the
deleted elements. This optimization cannot be used
for NCNs.
$$\vec{F}_i(x_\beta) = \frac{|x_\beta|}{|U(x_\beta)|} \sum_{y^+ \in P(x_\beta)} F_i(y^+) \cdot \mathrm{JoinSize}_{x_\beta}(y^+) + \sum_{\overline{y} \in N(x_\beta)} F_d(\overline{y}) \, |x_\beta| \, \frac{\mathrm{JSF}(U(x_\beta), \overline{y})}{1 - \mathrm{JSF}(U(x_\beta), \overline{y})} \tag{30a}$$
$$\vec{F}_d(x_\beta) = \sum_{y^+ \in P(x_\beta)} F_d(y^+) + \sum_{\overline{y} \in N(x_\beta)} F_i(\overline{y}) \, |x_\beta| \, \mathrm{JSF}(U(x_\beta), \overline{y}) \tag{30b}$$
The formula for $\mathrm{JoinSize}_{x_\beta}(y)$ (24) only considers
PCNs of $x_\beta$.
Algorithm 1: costPosInsVarI_{x_β}(y⁺): Expected costs for handling a +TR token from a PCN of a variant I beta node or one without NCNs.

input : beta node x_β, input y⁺ of x_β
output: expected costs
begin
    costs ← 0
    size ← 1
    r_1 ← y⁺
    while not empty(joinList) do
        // r_k is the right join operand
        r_k ← pop(joinList)
        costs ← costs + JC_{r_{k-1}, r_k}(size)
        if r_k is positive then
            // i.e. r_k⁺
            size ← size · JSF_{y⁺}(r_{k-1}, r_k⁺) · |r_k⁺|
        else
            // i.e. r̄_k
            size ← M_{x_β}(r̄_k) · (|X| / |U(X)|) · weighting_{x_β}(y⁺, r̄_k)
    return costs
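Read as code, Algorithm 1 accumulates the Cárdenas-based join costs while tracking the expected size of the intermediate result. The following Python transcription is our reading of the pseudocode; in particular, the negative branch is simplified to a caller-supplied, weighting-based size estimate, and all names are ours.

def cost_pos_ins_var1(join_list, jc, jsf):
    """Our transcription of Algorithm 1. join_list is an ordered list of
    (node, neg_weighted_size, is_positive) entries; jc(left, right, size)
    and jsf(left, right) are caller-supplied estimates. For negative
    operands the entry carries the precomputed weighted size."""
    costs = 0.0
    size = 1.0
    prev = None  # r_{k-1}; starts at the input y+
    for node, neg_weighted_size, is_positive in join_list:
        costs += jc(prev, node, size)          # join with the next operand
        if is_positive:
            size *= jsf(prev, node) * node["size"]
        else:
            size = neg_weighted_size           # weighting-based estimate
        prev = node
    return costs

# Example with two positive join partners and constant estimates:
nodes = [({"size": 1000}, None, True), ({"size": 500}, None, True)]
print(cost_pos_ins_var1(nodes, jc=lambda l, r, s: s * 0.01,
                        jsf=lambda l, r: 0.001))  # 0.02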
The runtime costs can be described with:
$$\sum_{y^+ \in P(x_\beta)} \Bigl( F_i(y^+) \cdot \mathrm{costPosIns}_{x_\beta}(y^+) + F_d(y^+) \cdot \mathrm{costPosDel}_{x_\beta}(y^+) \Bigr) + \sum_{\overline{y} \in N(x_\beta)} \Bigl( F_i(\overline{y}) \cdot \mathrm{costNegIns}_{x_\beta}(\overline{y}) + F_d(\overline{y}) \cdot \mathrm{costNegDel}_{x_\beta}(\overline{y}) \Bigr) \tag{31}$$
The costs vary depending on whether a variant I or a
variant II beta node is used.

Variant I Beta Nodes (or without NCNs). As the
result set has to be searched and the matching entries
have to be deleted, the costs for handling a -TR token
from a PCN (costPosDel) are:
$$\mathrm{costPosDel}_{x_\beta}(y^+) = m(x_\beta) + C\bigl( m(x_\beta), \mathrm{JoinSize}_{x_\beta}(y^+) \bigr) \tag{32}$$
The costs for handling a +TR token from an NCN
(costNegIns) can be approximated by $\mathrm{JC}_{\overline{y}, x_\beta}(1)$, as
the corresponding join can be performed efficiently
for NCNs.
The costs for handling a +TR token from a PCN
(costPosIns) can be approximated by Algorithm 1.
The costs for handling a -TR token from an NCN
(costNegDel) can be approximated by Algorithm 2 for variant I
beta nodes. For beta nodes without NCNs, no algorithm
is necessary. Algorithms 2 and 1 are identical except for the
additional line in Algorithm 2 and the positive / negative signs
marking the input node y and the node l. They have
only been separated for the sake of clarity.
Algorithm 2: costNegDelVarI_{x_β}(ȳ): Expected costs for handling a -TR token from an NCN of a variant I beta node.

input : beta node x_β, input ȳ of x_β
output: expected costs
begin
    costs ← 0
    size ← 1
    r_1 ← ȳ
    while not empty(joinList) do
        // r_k is the right join operand
        r_k ← pop(joinList)
        costs ← costs + JC_{r_{k-1}, r_k}(size)
        if r_k is positive then
            // i.e. r_k⁺
            size ← size · JSF_{ȳ}(r_{k-1}, r_k⁺) · |r_k⁺|
        else
            // i.e. r̄_k
            size ← M_{x_β}(r̄_k) · (|X| / |U(X)|) · weighting_{x_β}(ȳ, r̄_k)
    costs ← costs + JC_{x_β, ȳ}(size)   // final join
    return costs
A way to identify the weighting of a node is
still needed. Why the weighting is important was
discussed in Section 4.2.3: fact tuples that originated from
the join of the PCNs of a node can be culled by several
NCNs. This fact prevents using
$$|U(x_\beta)| - \sum_{\overline{y} \in N(x_\beta)} |\overline{y}| \, \mathrm{JSF}(\overline{y}, U(x_\beta)) \, |U(x_\beta)|$$
to estimate $|x_\beta|$.
From now on this multiple culling is called overlap,
as the filters from the different NCNs overlap.
The following holds when the overlap is maximal, i.e.
all culled fact tuples are filtered by all NCNs:
$$\max \Bigl\{ |\overline{y}| \, \mathrm{JSF}(\overline{y}, U(x_\beta)) \, |U(x_\beta)| \;\Bigm|\; \overline{y} \in N(x_\beta) \Bigr\} = \Bigl| \bigcup_{\overline{y} \in N(x_\beta)} \overline{y} \bowtie U(x_\beta) \Bigr| \,. \tag{33}$$
While the overlap is minimal, i.e. every culled fact
tuple is filtered by only one NCN,
$$\sum_{\overline{y} \in N(x_\beta)} |\overline{y}| \, \mathrm{JSF}(\overline{y}, U(x_\beta)) \, |U(x_\beta)| = \Bigl| \bigcup_{\overline{y} \in N(x_\beta)} \overline{y} \bowtie U(x_\beta) \Bigr| \tag{34}$$
holds. As in general the data is not sufficient to calculate
the overlap correctly, the mean between the minimal
and maximal overlap will be used.
$M_{x_\beta}(e)$ is the mean expectancy for the number of
matching fact tuples in the NCNs in the join list of
$e$ (using its order). The indices of the nodes $\overline{d}_k$ match
the indices of the join list of $e$; $d^+_{j_{k-1}}$ is the last PCN
with an index lower than $k$.
$$M_{x_\beta}(e) = \frac{1}{2} \cdot \Biggl( \max_{\overline{d}_k \in N(x_\beta)} \Bigl\{ |\overline{d}_k| \, \mathrm{JSF}_e(d^+_{j_{k-1}}, \overline{d}_k) \Bigr\} + \sum_{\overline{d}_k \in N(x_\beta)} |\overline{d}_k| \, \mathrm{JSF}_e(d^+_{j_{k-1}}, \overline{d}_k) \Biggr) \tag{35}$$
$\mathrm{weighting}_{x_\beta}(e, \overline{z})$ gives an estimate for the
number of matching fact tuples in the NCN $\overline{z}$ after
handling all joins in the join list of $e$ until reaching
$\overline{z}$:
$$\mathrm{weighting}_{x_\beta}(e, \overline{z}_k) = M_{x_\beta}(e) \cdot \frac{|\overline{z}_k| \, \mathrm{JSF}_e(d^+_{j_{k-1}}, \overline{z}_k)}{\sum_{\overline{d}_k \in N(x_\beta)} |\overline{d}_k| \, \mathrm{JSF}_e(d^+_{j_{k-1}}, \overline{d}_k)} \tag{36}$$
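Equations (35) and (36) average the maximal and minimal overlap and distribute the result proportionally among the NCNs. A compact sketch (names ours):

def mean_overlap(ncn_terms):
    """Equation (35): ncn_terms holds the expected matches
    |d_k| * JSF_e(d+, d_k) per NCN; the mean of the maximal and
    minimal overlap is returned."""
    return 0.5 * (max(ncn_terms) + sum(ncn_terms))

def weighting(ncn_terms, k):
    """Equation (36): distribute M proportionally to the k-th NCN."""
    return mean_overlap(ncn_terms) * ncn_terms[k] / sum(ncn_terms)

terms = [8.0, 2.0]            # expected matches per NCN
print(mean_overlap(terms))    # 0.5 * (8 + 10) = 9.0
print(weighting(terms, 0))    # 9 * 8/10 = 7.2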
Now we can deduce:
$$\frac{|X|}{|U(X)|} = \frac{M_{x_\beta}(\overline{e}_k) \left( \frac{|X|}{|U(X)|} \right)}{M_{x_\beta}(\overline{e}_k)} = \frac{M_{x_\beta}(\overline{e}_k) \left( \frac{|X|}{|U(X)|} \right)}{\sum_{\overline{e}_k \in N(x_\beta)} \mathrm{weighting}_{x_\beta}(y, \overline{e}_k)} = \sum_{\overline{e}_k \in N(x_\beta)} \frac{M_{x_\beta}(\overline{e}_k) \left( \frac{|X|}{|U(X)|} \right)}{\mathrm{weighting}_{x_\beta}(y, \overline{e}_k)} \,. \tag{37}$$
Now a function for estimating the size of the join considering
NCNs can be given:
$$\mathrm{JoinSize}_{x_\beta}(y) = \prod_{k=2}^{n} \begin{cases} |e^+_k| \cdot \mathrm{JSF}_y(d^+_{j_{k-1}}, e^+_k) & \text{for } e^+_k \\ M_{x_\beta}(\overline{e}_k) \left( \frac{|X|}{|U(X)|} \right) \mathrm{weighting}_{x_\beta}(y, \overline{e}_k) & \text{otherwise} \end{cases} \tag{38}$$
For variant I beta nodes or beta nodes without NCNs,
the runtime costs are:
$$\sum_{y^+ \in P(x_\beta)} \Bigl( F_i(y^+) \cdot \mathrm{costPosInsVarI}_{x_\beta}(y^+) + F_d(y^+) \cdot \bigl( m(x_\beta) + C(m(x_\beta), \mathrm{JoinSize}_{x_\beta}(y^+)) \bigr) \Bigr) + \sum_{\overline{y} \in N(x_\beta)} \Bigl( F_i(\overline{y}) \cdot \mathrm{JC}_{\overline{y}, x_\beta}(1) + F_d(\overline{y}) \cdot \mathrm{costNegDelVarI}_{x_\beta}(\overline{y}) \Bigr) \tag{39}$$
Variant II Beta Node. The costs for handling a -TR
token from a PCN (costPosDel) can be approximated
by $m(U(x_\beta)) + C\bigl( m(U(x_\beta)), \mathrm{JoinSize}_{x_\beta}(y^+) \bigr)$ runtime
units, because the result set has to be searched and the
matching fact tuples have to be updated.
The costs for handling a +TR token from an NCN
(costNegIns) can be approximated by $2 \cdot \mathrm{JC}_{\overline{y}, U(x_\beta)}(1)$
runtime units, because the fact has to be joined with
the result set and the counting fields have to be updated.
Algorithm 3 calculates the costs for handling a +TR token
from a PCN (costPosIns) and Algorithm 4 the costs for handling
a -TR token from an NCN (costNegDel).
Algorithm 3: costPosInsVarII_{x_β}(y⁺): Estimated costs for handling a +TR token from a PCN of a variant II beta node.

input : beta node x_β, input y⁺ of x_β
output: estimated costs
begin
    costs ← 0
    size ← 1
    r⁺_1 ← y⁺
    while not empty(positiveJoinList) do
        // r⁺_k is the right join operand
        r⁺_k ← pop(positiveJoinList)
        costs ← costs + JC_{r⁺_{k-1}, r⁺_k}(size)
        size ← size · JSF_{y⁺}(r⁺_{k-1}, r⁺_k) · |r⁺_k|
    while not empty(negativeInputList) do
        n̄ ← pop(negativeInputList)
        costs ← costs + JC_{U(x_β), n̄}(size)   // join
    // save results and counting fields
    costs ← costs + size
    return costs
Algorithm 4: costNegDelVarII_{x_β}(ȳ): Estimated costs for handling a -TR token from an NCN of a variant II beta node.

input : beta node x_β, input ȳ of x_β
output: estimated costs
begin
    costs ← 0
    size ← 1
    r_1 ← ȳ
    while not empty(positiveJoinList) do
        // r⁺_k is the right join operand
        r⁺_k ← pop(positiveJoinList)
        costs ← costs + JC_{r_{k-1}, r⁺_k}(size)
        size ← size · JSF_{ȳ}(r_{k-1}, r⁺_k) · |r⁺_k|
    // decrement counting fields
    costs ← costs + size
    return costs
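Algorithm 4 only walks the positive join list and adds the final counting-field update. A Python transcription under the same assumptions as before (our names, caller-supplied join cost estimate):

def cost_neg_del_var2(positive_join_list, jc):
    """Our transcription of Algorithm 4; positive_join_list holds
    (size, JSF) pairs for the positive join partners in list order,
    and jc(size_r, intermediate_size) is a caller-supplied estimate."""
    costs = 0.0
    size = 1.0
    for node_size, node_jsf in positive_join_list:
        costs += jc(node_size, size)   # join with the next PCN
        size *= node_jsf * node_size   # expected intermediate size
    costs += size                      # decrement counting fields
    return costs

print(cost_neg_del_var2([(1000, 0.001), (500, 0.002)],
                        jc=lambda r, s: s * 0.01))  # 1.02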
RatingofDiscriminationNetworksforRule-basedSystems
41
So the runtime costs of a variant II beta node are:
$$\sum_{y^+ \in P(x_\beta)} \Bigl( F_i(y^+) \cdot \mathrm{costPosInsVarII}_{x_\beta}(y^+) + F_d(y^+) \cdot \bigl( m(U(x_\beta)) + C(m(U(x_\beta)), \mathrm{JoinSize}_{x_\beta}(y^+)) \bigr) \Bigr) + \sum_{\overline{y} \in N(x_\beta)} \Bigl( F_i(\overline{y}) \cdot 2 \cdot \mathrm{JC}_{\overline{y}, U(x_\beta)}(1) + F_d(\overline{y}) \cdot \mathrm{costNegDelVarII}_{x_\beta}(\overline{y}) \Bigr) \tag{40}$$
4.2.5 Terminal Node
The runtime and memory costs of a terminal node can
be omitted, as it only needs to set a flag if and only
if the connected beta node $x_\beta$ has fact tuples in its
internal memory.
5 CONCLUSIONS
In this paper a general structure for DNs has been de-
veloped allowing for a consistent rating. For the com-
ponents of DNs in this structure cost functions have
been worked out. By rating the normalised DN every
DN can be rated.
In the course of the rating, simplifications and estimations
had to be made for several reasons. These
include the estimates for the mean of the overlap or the
simplified fact sizes. Both quantities could have been
declared as necessary input parameters, but the improvement
of the cost estimates does not seem to compensate
for the additional expenses for the one using the
rating function. The same holds for the severe simplification
regarding the JSFs. Forcing the specification
of all values would vehemently increase the amount of
base values needed, forfeiting the abstract character of
the algorithm.
On the other hand all simplifications made can be
replaced by the correct values with little effort allow-
ing for a more precise rating.
REFERENCES
Brant, D., Grose, T., Lofaso, B., and Miranker, D. (1991).
Effects of Database Size on Rule System Performance:
Five Case Studies. In Proceedings of the 17th
International Conference on Very Large Data Bases
(VLDB).
Brownston, L., Farrell, R., Kant, E., and Martin, N. (1985).
Programming expert systems in OPS5: an introduc-
tion to rule-based programming. Addison-Wesley
Longman Publishing Co., Inc., Boston, MA, USA.
Cárdenas, A. F. (1975). Analysis and performance
of inverted data base structures. Commun. ACM,
18(5):253–263.
Forgy, C. L. (1981). OPS5 User’s Manual. Technical report,
Department of Computer Science, Carnegie-Mellon
University.
Forgy, C. L. (1982). Rete: A fast algorithm for the many
pattern/many object pattern match problem. Artificial
Intelligence, 19(1):17 – 37.
Hanson, E. N. (1993). Gator: A Discrimination Network
Structure for Active Database Rule Condition Match-
ing. Technical report, University of Florida.
Hanson, E. N. and Hasan, M. S. (1993). Gator: An Op-
timized Discrimination Network for Active Database
Rule Condition Testing. Technical report, University
of Florida.
Jamocha (2006). Jamocha Project Page. http://
www.jamocha.org, http://sourceforge.net/projects/
jamocha.
Miranker, D. P. (1987). TREAT: A Better Match Algorithm
for AI Production Systems; Long Version. Techni-
cal report, University of Texas at Austin, Austin, TX,
USA.
Winston, P. H. (1992). Artificial intelligence. Addison-
Wesley.
Yao, S. B. (1977). Approximating block accesses in
database organizations. Commun. ACM, 20(4):260–
261.
DATA2013-2ndInternationalConferenceonDataManagementTechnologiesandApplications
42