PREFERENCE RULES IN DATABASE QUERYING

Sergio Greco, Cristian Molinaro and Francesco Parisi

DEIS, University of Calabria, 87036 Rende, Italy

Keywords:

Deductive databases, prioritized queries, preferences.

Abstract:

The paper proposes the use of preferences for querying databases. In expressing queries it is natural to express preferences among tuples belonging to the answer. This can be done in commercial DBMSs, for instance, by ordering the tuples in the result. The paper presents a different proposal, based on similar approaches deeply investigated in the artificial intelligence field, where preferences are used to restrict the result of queries posed over databases. In our proposal a query over a database $\mathcal{DB}$ is a triple $\langle q, \mathcal{P}, \Phi \rangle$, where $q$ denotes the output relation, $\mathcal{P}$ is a Datalog program (or an SQL query) used to compute the result and $\Phi$ is a set of preference rules used to introduce preferences on the computed tuples. Tuples which are "dominated" by other tuples do not belong to the result and cannot be used to infer other tuples. A new stratified semantics is presented where the program $\mathcal{P}$ is partitioned into strata and the preference rules associated with each stratum of $\mathcal{P}$ are divided into layers; the result of a query is obtained by computing one stratum at a time and by applying the preference rules one layer at a time. We show that our technique is sound and that the complexity of computing queries with preference rules is still polynomial.

1 INTRODUCTION

The growing volume of available information poses new challenges to the database and artificial intelligence communities. Recent research has investigated new techniques for accessing large volumes of data, such as user-centered access to information, information filtering and extraction, and policies to reduce the data presented to users. An interesting direction, deeply studied in the artificial intelligence and nonmonotonic reasoning fields, consists in the use of preferences to express priorities on alternative scenarios.

The paper presents a logical framework wherein preferences are used to restrict the result of queries posed over a database. This is an important aspect in querying large databases such as those used by search engines. In this context, the result of a query contains only tuples which are not dominated by other tuples, and dominated tuples cannot be used to infer new information. The novelty of the presented approach is that preferences are stratified and applied one stratum at a time. A second innovative aspect of this proposal is that preferences on both base and derived atoms are considered, as well as general (recursive) queries which can be expressed by means of stratified Datalog.

Example 1 Consider a database $\mathcal{DB} = \{\text{fish}, \text{beef}\}$ and a program $\mathcal{P}$ consisting of the two rules:

\[ \text{red-wine} \leftarrow \text{beef} \]
\[ \text{white-wine} \leftarrow \text{fish} \]

Assume now to have a query defined by the rules in $\mathcal{P}$ and the preference

\[ \rho_1 = \text{red-wine} \succ \text{white-wine} \leftarrow \text{beef} \]

stating that if there is beef, we prefer red-wine to white-wine. The set of preferred atoms contains the base atoms fish and beef and the derived atom red-wine (the atom white-wine is not preferred).

Assume now to also have the preference $\rho_2 = \text{fish} \succ \text{beef}$ stating that we prefer fish to beef. In this case, first the preference rule $\rho_2$, and next the preference rule $\rho_1$, are considered. However, $\rho_1$ cannot be applied as beef is not in the preferred set of atoms. Consequently, the set of preferred atoms, with respect to the preference rules $\rho_1$ and $\rho_2$, is $\{\text{fish}, \text{white-wine}\}$. □

Greco S., Molinaro C. and Parisi F. (2007).
PREFERENCE RULES IN DATABASE QUERYING.
In Proceedings of the Ninth International Conference on Enterprise Information Systems - DISI, pages 119-124
DOI: 10.5220/0002389901190124
© SciTePress

Contributions. In this paper we study the use of preferences in querying databases. We consider general (stratified) Datalog queries and general preferences: the head of preference rules may contain atoms belonging to different relations and the body consists of a conjunction of literals. A semantics where both query and preferences are partitioned into strata is defined. Under such a semantics, the query is computed one stratum at a time and, for each stratum of the query, the preferences are applied one stratum at a time.

Related Work. The increased interest in preferen-

ces in logic programs is reﬂected by an extensive

number of proposals and systems for preference han-

dling. Most of the approaches propose an extension

of logic programming by adding preference informa-

tion. The most common form of preference consists

in specifying a strict partial order on rules (Delgrande

et al., 2003; Gelfond and Son, 1997; Sakama and

Inoue, 2000; Zhang and Foo, 1997), whereas more

sophisticated forms of preferences also allow priori-

ties to be speciﬁed between conjunctive (disjunctive)

knowledge with preconditions (Brewka et al., 2003;

Sakama and Inoue, 2000) and numerical penalties for

suboptimal options (Brewka, 2004).

Considering the use of preferences in querying databases, an extension of relational calculus expressing preferences for tuples in terms of logical conditions has been proposed in (Lacroix and Lavency, 1987). Preferences requiring non-deterministic choice among atoms which minimize or maximize the value of some attribute have been proposed in (Greco and Zaniolo, 2002). An extension of Datalog with preference relations, subsuming the approach proposed in (Lacroix and Lavency, 1987), has been proposed in (Kostler et al., 1995), whereas an extension of SQL including preferences has been proposed in (Kießling, 2002; Kießling and Kostler, 2002). In the last proposal several built-in operators and a formal definition of their combinations (i.e. intersection, union, Pareto composition, etc.) have been considered. Borzsonyi et al. proposed the skyline operator (Borzsonyi et al., 2001) to filter out a set of "interesting" points (i.e. points not dominated by any other point) from a potentially large set of points. An extension of SQL with a skyline operator has also been proposed. A framework for specifying preferences using logical formulas and its embedding into relational algebra has been introduced in (Chomicki, 2003). The paper also introduces the winnow operator, which generalizes the skyline operator. The implementation of winnow and ranking is also studied in (Torlone and Ciaccia, 2002). Algorithms for computing skyline operators are also studied in (Kossmann et al., 2002; Papadias et al., 2003; Chomicki et al., 2003). In (Agrawal and Wimmers, 2002) the use of quantitative preferences (scoring functions) in queries is proposed.

In this work, in contrast with previous proposals, gen-

eral preferences and a different (stratiﬁed) semantics,

which we believe to be more intuitive, are considered.

2 BACKGROUND

Familiarity with disjunctive logic programs and di-

sjunctive deductive databases is assumed (Ullman,

1988).

Datalog Programs. A term is either a constant or a variable. An atom is of the form $p(t_1, \ldots, t_h)$, where $p$ is a predicate symbol of arity $h$ and $t_1, \ldots, t_h$ are terms. A literal is either an atom $A$ or its negation $\mathit{not}\ A$. A (Datalog) rule $r$ is a clause of the form

\[ A \leftarrow B_1, \ldots, B_m, \mathit{not}\ B_{m+1}, \ldots, \mathit{not}\ B_n, \varphi \qquad n \geq 0 \]

where $A, B_1, \ldots, B_n$ are atoms, whereas $\varphi$ is a conjunction of built-in atoms of the form $u\,\theta\,v$ where $u$ and $v$ are terms and $\theta$ is a comparison predicate. $A$ is the head of $r$ (denoted by $Head(r)$), whereas the conjunction $B_1, \ldots, B_m, \mathit{not}\ B_{m+1}, \ldots, \mathit{not}\ B_n, \varphi$ is the body of $r$ (denoted by $Body(r)$). It is assumed that each rule is safe, i.e. a variable appearing in the head or in a negative literal also appears in a positive body literal.

A (Datalog) program is a finite set of rules. A not-free program is called positive. The Herbrand Universe $U_{\mathcal{P}}$ of a program $\mathcal{P}$ is the set of all constants appearing in $\mathcal{P}$, and its Herbrand Base $B_{\mathcal{P}}$ is the set of all ground atoms constructed from the predicates appearing in $\mathcal{P}$ and the constants from $U_{\mathcal{P}}$. A term (resp. an atom, a literal, a rule or a program) is ground if no variable occurs in it. A rule $r'$ is a ground instance of a rule $r$ if $r'$ is obtained from $r$ by replacing every variable in $r$ with some constant in $U_{\mathcal{P}}$; $ground(\mathcal{P})$ denotes the set of all ground instances of the rules in $\mathcal{P}$.

An interpretation $M$ for a Datalog program $\mathcal{P}$ is any subset of $B_{\mathcal{P}}$; $M$ is a model of $\mathcal{P}$ if it satisfies all rules in $ground(\mathcal{P})$. The (model-theoretic) semantics for positive $\mathcal{P}$ assigns to $\mathcal{P}$ the set of its minimal models $\mathcal{MM}(\mathcal{P})$, where a model $M$ for $\mathcal{P}$ is minimal if no proper subset of $M$ is a model for $\mathcal{P}$. For any interpretation $M$, $\mathcal{P}^M$ is the ground positive program derived from $ground(\mathcal{P})$ by 1) removing all rules that contain

ICEIS 2007 - International Conference on Enterprise Information Systems


a negative literal $\mathit{not}\ A$ in the body with $A \in M$, and 2) removing all negative literals from the remaining rules. An interpretation $M$ is a stable model of $\mathcal{P}$ if and only if $M \in \mathcal{MM}(\mathcal{P}^M)$ (Gelfond and Lifschitz, 1988). For general $\mathcal{P}$, the stable model semantics assigns to $\mathcal{P}$ the set $\mathcal{SM}(\mathcal{P})$ of its stable models. It is well-known that stable models are minimal models (i.e. $\mathcal{SM}(\mathcal{P}) \subseteq \mathcal{MM}(\mathcal{P})$) and that for negation-free programs minimal and stable model semantics coincide (i.e. $\mathcal{SM}(\mathcal{P}) = \mathcal{MM}(\mathcal{P})$).
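The reduct construction and the stable model test can be illustrated with a minimal Python sketch. It assumes a ground, non-disjunctive program without built-in atoms, so that the positive reduct has a single minimal model (its least model); the tuple encoding of rules and the function names `reduct`, `least_model` and `is_stable` are illustrative choices, not notation from the paper.

```python
# A ground rule is encoded as (head, positive_body, negated_body),
# where both bodies are tuples of atom names (an assumed encoding).

def reduct(program, m):
    """Build P^M: drop rules with 'not A' where A is in M, then drop negation."""
    return [(head, pos, ()) for head, pos, neg in program
            if not any(a in m for a in neg)]

def least_model(positive_program):
    """Least model of a ground positive program by naive fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos, _ in positive_program:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model

def is_stable(program, m):
    """M is stable iff M is the minimal model of the reduct P^M."""
    return least_model(reduct(program, m)) == set(m)

# p <- not q.   q <- not p.
program = [("p", (), ("q",)), ("q", (), ("p",))]
print(is_stable(program, {"p"}), is_stable(program, {"p", "q"}))
```

The two-rule program used here is not stratified, which is why it has more than one stable model; the stratified programs considered in the rest of the paper have exactly one.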

Given a Datalog program $\mathcal{P}$, $\mathcal{G}_g(\mathcal{P}) = (V_g, E_g)$ denotes the dependency graph associated with $ground(\mathcal{P})$, where $V_g$ consists of all ground atoms appearing in $ground(\mathcal{P})$, whereas there is an arc from $B$ to $A$ in $E_g$ if there is a rule $r$ in $ground(\mathcal{P})$ such that $Head(r) = A$ and $B \in Body(r)$; the arc is said to be marked negatively if $B$ appears negated in the body of $r$. The dependency graph $\mathcal{G}(\mathcal{P}) = (V, E)$ associated with $\mathcal{P}$ is built by considering the ground program derived from $\mathcal{P}$ by eliminating all terms (i.e. every atom $p(t)$ is replaced by $p$). A ground atom $p(t)$ depends on a ground atom $q(u)$ if there is a path in $\mathcal{G}_g(\mathcal{P})$ from $q(u)$ to $p(t)$. Analogously, a predicate symbol $p$ depends on a predicate symbol $q$ if there is a path in $\mathcal{G}(\mathcal{P})$ from $q$ to $p$. The dependency is negated if there is an arc marked negatively in the path.
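As an illustration of the dependency graph at the predicate level, the following Python sketch builds the arcs and answers "does p depend on q, and is some such dependency negated?". The rule encoding, the example predicates (`edge`, `reach`, `node`, `unreach`) and the function names are assumptions made for the sketch.

```python
from collections import defaultdict

# A rule is encoded as (head, positive_body_preds, negated_body_preds).

def dependency_graph(rules):
    """Arcs go from a body predicate B to the head A; the flag marks negative arcs."""
    edges = defaultdict(set)
    for head, pos, neg in rules:
        for b in pos:
            edges[b].add((head, False))
        for b in neg:
            edges[b].add((head, True))
    return edges

def depends_on(edges, p, q):
    """Return (depends, negated) for 'p depends on q'.

    The search explores states (predicate, a negative arc was used so far);
    'negated' is True if at least one path from q to p uses a negative arc.
    """
    seen = set()
    stack = [(q, False)]
    while stack:
        node, neg = stack.pop()
        for nxt, arc_neg in edges.get(node, ()):
            state = (nxt, neg or arc_neg)
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return ((p, False) in seen or (p, True) in seen), (p, True) in seen

rules = [
    ("reach", ("edge",), ()),
    ("reach", ("reach", "edge"), ()),
    ("unreach", ("node",), ("reach",)),
]
edges = dependency_graph(rules)
print(depends_on(edges, "unreach", "edge"))
```

The same search applies unchanged to the ground graph if ground atoms are used as node names instead of predicate symbols.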

A partition $\pi_1, \ldots, \pi_k$ of the set of all predicate symbols of a Datalog program $\mathcal{P}$, where each $\pi_i$ is called a stratum, is a stratification of $\mathcal{P}$ if for each rule $r$ in $\mathcal{P}$ the predicates that appear only positively in the body of $r$ are in strata lower than or equal to the stratum of the predicate in the head of $r$, and the predicates that appear negatively are in strata lower than the stratum of the predicate in the head of $r$. The stratification of the predicates defines a stratification of the rules of $\mathcal{P}$ into strata $\langle \mathcal{P}_1, \ldots, \mathcal{P}_k \rangle$ where a stratum $\mathcal{P}_i$ contains the rules which define predicates in $\pi_i$. A Datalog program is called stratified if it has a stratification. Stratified (normal) programs have a unique stable model which coincides with the stratified model, obtained by computing the fixpoints of every stratum in their order.
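A stratification can be computed by the usual iterative relabelling procedure, sketched below in Python. This is a generic textbook-style algorithm rather than one given in the paper, and it returns some valid stratification, not necessarily the standard stratification introduced in Section 3.1; the rule encoding and the name `stratify` are assumptions.

```python
# A rule is encoded as (head, positive_body_preds, negated_body_preds).

def stratify(rules):
    """Return a dict predicate -> stratum number (from 1), or None if not stratified."""
    preds = {h for h, _, _ in rules} | {b for _, pos, neg in rules for b in pos + neg}
    stratum = {p: 1 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            # Positive body predicates: same stratum or lower.
            # Negated body predicates: strictly lower.
            need = max([stratum[b] for b in pos]
                       + [stratum[b] + 1 for b in neg]
                       + [stratum[head]])
            if need > len(preds):
                return None  # a cycle through negation forces unbounded growth
            if need > stratum[head]:
                stratum[head] = need
                changed = True
    return stratum

rules = [
    ("reach", ("edge",), ()),
    ("reach", ("reach", "edge"), ()),
    ("unreach", ("node",), ("reach",)),
]
print(stratify(rules))
```

Base predicates and `reach` end up in stratum 1 and `unreach` in stratum 2, so the fixpoint of `reach` is complete before its negation is evaluated.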

Queries. Predicate symbols are partitioned into two distinct sets: base predicates and derived predicates. Base predicates correspond to database relations defined over a given domain and they do not appear in the head of any rule, whereas derived predicates are defined by means of rules. Given a set of ground atoms $\mathcal{DB}$, a predicate symbol $p$ and a stratified program $\mathcal{P}$, $\mathcal{DB}[p]$ denotes the set of $p$-tuples in $\mathcal{DB}$, while $\mathcal{P}_{\mathcal{DB}}$ denotes the program derived from the union of $\mathcal{P}$ with the facts in $\mathcal{DB}$, i.e. $\mathcal{P}_{\mathcal{DB}} = \mathcal{P} \cup \mathcal{DB}$. The semantics of $\mathcal{P}_{\mathcal{DB}}$ is given by the stratified model (which coincides with the unique stable model) of $\mathcal{P}_{\mathcal{DB}}$. The answer to a query $Q = (g, \mathcal{P})$ over a database $\mathcal{DB}$, denoted by $Q(\mathcal{DB})$, is given by $M[g]$ where $M = \mathcal{SM}(\mathcal{P}_{\mathcal{DB}})$. In the following we also denote with $\mathcal{P}(\mathcal{DB}) = \mathcal{SM}(\mathcal{P}_{\mathcal{DB}})$ the application of $\mathcal{P}$ to $\mathcal{DB}$; therefore $Q(\mathcal{DB}) = \mathcal{P}(\mathcal{DB})[g]$.

3 PREFERENCE RULES AND

QUERIES

This section presents a framework for expressing preferences in the evaluation of queries posed on a given database. The framework is based on the introduction of preference rules, whose syntax is inspired by the management of priorities in the artificial intelligence field, logic programming and database querying (Brewka et al., 2003; Delgrande et al., 2003; Gelfond and Son, 1997; Sakama and Inoue, 2000; Zhang and Foo, 1997).

3.1 Syntax

A prioritized program consists of a set of standard rules (Datalog program) and a set of preference rules. As rules expressing preferences eliminate tuples which are derived by means of standard rules, we first introduce a standard stratification of the Datalog program to fix the order in which standard rules are applied. Preference rules are associated with each subprogram (stratum) and applied after the subprogram has been evaluated. Let us start by introducing the concept of standard stratification.

Definition 1 The standard stratification of a stratified program $\mathcal{P}$ consists of $k$ strata $\langle \mathcal{P}_1, \ldots, \mathcal{P}_k \rangle$ where $k$ is the minimal value such that for each $\mathcal{P}_i$ and for each pair of predicates $p$ and $q$ defined in $\mathcal{P}_i$ either they are mutually recursive or they are independent (i.e. $p$ does not depend on $q$ and $q$ does not depend on $p$). □

In the following, given an atom p(t), str(p(t)) de-

notes the stratum of the predicate symbol p (or equiv-

alently of the subprogram in which p is deﬁned) in the

standard stratiﬁcation.

Definition 2 A preference rule $\rho$ is of the form:

\[ A \succ C \leftarrow B_1, \ldots, B_m, \mathit{not}\ B_{m+1}, \ldots, \mathit{not}\ B_n, \varphi \qquad (1) \]

where $A, C, B_1, \ldots, B_n$ are atoms, and $\varphi$ is a conjunction of built-in atoms. □

Also in this case we assume that rules are safe. In the above definition $A \succ C$ is called the head of the preference rule (denoted as $Head(\rho)$), whereas the conjunction $B_1, \ldots, B_m, \mathit{not}\ B_{m+1}, \ldots, \mathit{not}\ B_n, \varphi$ is called the body (denoted as $Body(\rho)$). Moreover, we denote with $Head_1(\rho)$ and $Head_2(\rho)$ the first and the second atom in the head of $\rho$, respectively (i.e. $Head_1(\rho) = A$ and $Head_2(\rho) = C$).

The intuitive meaning of a ground preference rule $\rho$ is that if the body of $\rho$ is true, then the atom $A$ is preferable to $C$ (we also say that the atom $C$ is dominated by the atom $A$). This means that in the evaluation of a prioritized program $\langle \mathcal{P}, \Phi \rangle$ the model defining its semantics cannot contain the atom $C$ if it contains the atom $A$ and the body of the preference rule is true.

Let $\Phi$ be a preference program, i.e. a set of preference rules. The transitive closure of $ground(\Phi)$ is

\[ \Phi^* = ground(\Phi) \cup \{ (A \succ C \leftarrow body_1, body_2) \mid \exists (A \succ B \leftarrow body_1) \in \Phi^* \wedge \exists (B \succ C \leftarrow body_2) \in \Phi^* \}. \]

Analogously, we define $\Phi^*_p$ as the closure of the set of ground preference rules derived from $\Phi$ by replacing every atom $p(t)$ with $p$ and deleting built-in atoms.

Definition 3 A (ground) preference program $\Phi^*$ is layered if it is possible to partition it into $n$ layers $\langle \Phi^*[1], \ldots, \Phi^*[n] \rangle$ as follows:

• For each ground atom $A$ such that there is no ground rule $\rho \in \Phi^*$ such that $Head_2(\rho) = A$, $layer(A) = 0$;

• For every ground atom $C$ such that there is a rule $\rho$ of the form (1) (i.e. such that $Head_2(\rho) = C$), $layer(C) > \max\{layer(B_1), \ldots, layer(B_n), 0\}$ and $layer(C) \geq layer(A)$;

• The layer of a preference rule $\rho \in \Phi^*$, denoted as $layer(\rho)$, is equal to $layer(Head_2(\rho))$;

• $\Phi^*[i]$ consists of all preference rules associated with the layer $i$. □

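Example 2 Consider the set of preference rules $\Phi$:

\[ \rho_1:\ \text{fish} \succ \text{beef} \leftarrow \]
\[ \rho_2:\ \text{red-wine} \succ \text{white-wine} \leftarrow \text{beef} \]
\[ \rho_3:\ \text{white-wine} \succ \text{red-wine} \leftarrow \text{fish} \]

The transitive closure $\Phi^*$ consists of the rules $\rho_1, \rho_2, \rho_3$ plus the following rules

\[ \rho_4:\ \text{red-wine} \succ \text{red-wine} \leftarrow \text{beef}, \text{fish} \]
\[ \rho_5:\ \text{white-wine} \succ \text{white-wine} \leftarrow \text{fish}, \text{beef} \]

$\Phi^*$ is partitioned into the two layers $\Phi^*[1] = \{\rho_1\}$ and $\Phi^*[2] = \{\rho_2, \rho_3, \rho_4, \rho_5\}$. □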

As will be clear in the next subsection, preference rules of the form $A \succ A \leftarrow body$ are useless and can be deleted. Therefore, in the above example $\Phi^*[2] = \{\rho_2, \rho_3\}$.
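The closure of Example 2 can be reproduced with a short Python sketch. Two assumptions are made here that the paper does not spell out: a ground preference rule is encoded as a triple `(A, C, body)` with `body` a frozenset of literals, and a newly derived rule is skipped when an existing rule with the same head already has a body that is a subset of the new body (such a rule fires whenever the new one would, so nothing is lost and the iteration stays small).

```python
from itertools import product

def closure(rules):
    """Transitive closure of ground preference rules (A, C, body)."""
    result = set(rules)
    changed = True
    while changed:
        changed = False
        for (a, b1, body1), (b2, c, body2) in product(list(result), repeat=2):
            if b1 != b2:
                continue  # only chain A > B with B > C
            new_body = body1 | body2
            # Assumption of this sketch: drop rules subsumed by an existing one.
            if any(x == a and y == c and body <= new_body for x, y, body in result):
                continue
            result.add((a, c, new_body))
            changed = True
    return result

rho1 = ("fish", "beef", frozenset())
rho2 = ("red-wine", "white-wine", frozenset({"beef"}))
rho3 = ("white-wine", "red-wine", frozenset({"fish"}))

star = closure({rho1, rho2, rho3})
useful = {r for r in star if r[0] != r[1]}  # delete rules of the form A > A
print(len(star), len(useful))
```

Under these assumptions the closure contains the three original rules plus the two reflexive rules of Example 2, and deleting the reflexive rules gives back the original set.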

Example 3 Consider the set of preference rules $\Phi$:

\[ \rho_1:\ \text{fish} \succ \text{beef} \leftarrow \text{white-wine} \]
\[ \rho_2:\ \text{red-wine} \succ \text{white-wine} \leftarrow \text{beef} \]

According to $\rho_1$ the layer of beef must be greater than the layer of white-wine, whereas according to $\rho_2$ the layer of white-wine must be greater than the layer of beef. Thus, the set of preference rules is not layered. □
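The layering conditions of Definition 3 can be checked mechanically. The Python sketch below computes the smallest layer numbers satisfying the constraints, which is one admissible choice and an assumption of the sketch, and reports `None` when no assignment exists. Rules are encoded as `(A, C, body_atoms)`, where `body_atoms` lists the atoms of the body whether or not they occur negated; the name `layers` is illustrative.

```python
def layers(rules):
    """Least layer assignment for the atoms of ground preference rules, or None."""
    atoms = {x for a, c, body in rules for x in (a, c, *body)}
    layer = {x: 0 for x in atoms}  # atoms never dominated stay at layer 0
    changed = True
    while changed:
        changed = False
        for a, c, body in rules:
            # layer(C) > max{layer(B_1), ..., layer(B_n), 0} and layer(C) >= layer(A)
            need = max(1 + max([layer[b] for b in body] + [0]), layer[a])
            if need > len(atoms):
                return None  # the constraints are cyclic: not layered
            if need > layer[c]:
                layer[c] = need
                changed = True
    return layer

example2 = [
    ("fish", "beef", ()),
    ("red-wine", "white-wine", ("beef",)),
    ("white-wine", "red-wine", ("fish",)),
]
example3 = [
    ("fish", "beef", ("white-wine",)),
    ("red-wine", "white-wine", ("beef",)),
]
print(layers(example2))
print(layers(example3))
```

For Example 2 the atom beef gets layer 1 and both wines get layer 2, so the first rule falls in layer 1 and the other two in layer 2; for Example 3 the mutual strict constraints cannot be satisfied.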

Observe that in the above definition, in order to compute the closure of the ground instantiation of $\Phi$, we need to know the database $\mathcal{DB}$ containing all constants in the database domain. Therefore, checking whether $\Phi^*$ can be partitioned into layers cannot be done at compile-time. It is possible to define sufficient conditions which guarantee that the set of preference rules can be partitioned into layers by considering the (ground) program $\Phi^*_p$ introduced above instead of the program $\Phi^*$. This means that if $\Phi^*_p$ can be partitioned into layers, the set $\Phi^*$ can be partitioned into layers as well, although the layers of $\Phi^*$ may be different from the layers of $\Phi^*_p$ (the layers of $\Phi^*$ define a "refinement" of the layers of $\Phi^*_p$).

Definition 4 A prioritized query is of the form $\langle q, \mathcal{P}, \Phi \rangle$ where $q$ is a predicate symbol denoting the output relation, $\mathcal{P}$ is a (stratified) Datalog program and $\Phi$ is a set of preference rules. □

As said before, the intuitive meaning of a prioritized query $\langle q, \mathcal{P}, \Phi \rangle$ over a database $\mathcal{DB}$ is that the atoms derived from $\mathcal{P}$ and $\mathcal{DB}$ must satisfy the preference conditions defined in $\Phi$.

Definition 5 A prioritized query $Q = \langle q, \mathcal{P}, \Phi \rangle$ is said to be well formed if $\Phi^*$ is layered and for every ground atom $C$ such that there is a rule $\rho$ of the form (1) (i.e. such that $Head_2(\rho) = C$) it holds that

1. $str(C) \geq \max\{str(A), str(B_1), \ldots, str(B_n)\}$, and

2. $A, B_1, \ldots, B_n$ do not depend on $C$ in $\mathcal{P}$. □

In the following we assume that our queries are well formed. Sufficient conditions can be defined on the basis of the dependency graph $\mathcal{G}(\mathcal{P})$.

3.2 Semantics

First we analyze the case where $\Phi$ defines preferences on base atoms and next we consider the case where $\Phi$ expresses preferences on base and derived atoms, i.e. also on atoms defined in $\mathcal{P}$.

3.2.1 Preferences On Base Atoms

It is assumed here to have a query $Q = \langle q, \mathcal{P}, \Phi \rangle$ and that the preference rules in $\Phi$ express preferences only among base atoms. As said before, $\Phi^*$ can be partitioned into $n$ layers $\Phi^* = \langle \Phi^*[1], \ldots, \Phi^*[n] \rangle$.

Definition 6 Let $\mathcal{DB}$ be a set of ground atoms, $\Phi$ a set of preference rules such that $\Phi^* = \langle \Phi^*[1], \ldots, \Phi^*[n] \rangle$, and $t$, $u$ two atoms in $\mathcal{DB}$. We say that $t$ is preferable to $u$ with respect to $\Phi^*[i]$ (denoted as $t \sqsupset_{\Phi[i]} u$) if

• $\exists (t \succ u \leftarrow body_1) \in \Phi^*[i]$ s.t. $\mathcal{DB} \models body_1$, and

• $\nexists (u \succ t \leftarrow body_2) \in \Phi^*[i]$ s.t. $\mathcal{DB} \models body_2$.

The set of tuples in $\mathcal{DB}$ which are preferred with respect to $\Phi^*[i]$ is $\Phi^*[i](\mathcal{DB}) = \{ t \mid t \in \mathcal{DB} \wedge \nexists u \in \mathcal{DB} \text{ s.t. } u \sqsupset_{\Phi[i]} t \}$. □

Observe that $\Phi^*$ could contain preference rules of the form $A \succ A \leftarrow body$. Such preferences are useless as they are not used to infer preferences among ground atoms and can be deleted from $\Phi^*$.

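Example 4 Consider the database $\mathcal{DB} = \{\text{fish}, \text{beef}, \text{red-wine}, \text{white-wine}, \text{pie}, \text{ice-cream}\}$ and the following preference rules $\Phi$:

\[ \rho_1:\ \text{pie} \succ \text{ice-cream} \leftarrow \]
\[ \rho_2:\ \text{red-wine} \succ \text{white-wine} \leftarrow \text{fish} \]
\[ \rho_3:\ \text{white-wine} \succ \text{red-wine} \leftarrow \text{beef} \]

The set $\Phi^*$ consists, without considering useless rules, of a unique layer $\Phi^*[1] = \{\rho_1, \rho_2, \rho_3\}$. The application of $\Phi^*[1]$ to $\mathcal{DB}$ gives the set $\Phi^*[1](\mathcal{DB}) = \{\text{fish}, \text{beef}, \text{red-wine}, \text{white-wine}, \text{pie}\}$. □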
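The application of a single layer in Definition 6 can be sketched in Python as follows. The encoding is an assumption of the sketch: a ground preference rule is a triple `(A, C, body)`, body literals are strings, a literal starting with `"not "` is a negated atom, and built-in atoms are left out. The function names are illustrative.

```python
def holds(body, db):
    """Truth of a conjunction of ground literals in the set of atoms db."""
    return all((lit[4:] not in db) if lit.startswith("not ") else (lit in db)
               for lit in body)

def preferable(t, u, layer_rules, db):
    """t is preferable to u: some applicable rule t > u and no applicable rule u > t."""
    forward = any(a == t and c == u and holds(b, db) for a, c, b in layer_rules)
    backward = any(a == u and c == t and holds(b, db) for a, c, b in layer_rules)
    return forward and not backward

def apply_layer(layer_rules, db):
    """Keep the atoms of db that no other atom of db is preferable to."""
    return {t for t in db
            if not any(preferable(u, t, layer_rules, db) for u in db)}

db = {"fish", "beef", "red-wine", "white-wine", "pie", "ice-cream"}
layer1 = [
    ("pie", "ice-cream", ()),
    ("red-wine", "white-wine", ("fish",)),
    ("white-wine", "red-wine", ("beef",)),
]
print(sorted(apply_layer(layer1, db)))
```

On the data of Example 4 only ice-cream is discarded: the two wine rules are both applicable and point in opposite directions, so neither wine is preferable to the other.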

Definition 7 Let $\mathcal{DB}$ be a database and $Q = \langle q, \mathcal{P}, \Phi \rangle$ be a query such that $\Phi$ expresses preferences only on base atoms and the set of ground preference rules $\Phi^*$ is layered into $\Phi^* = \langle \Phi^*[1], \ldots, \Phi^*[n] \rangle$. Then the set of preferred tuples with respect to $\Phi^*$ is

\[ M = \mathcal{P}(\Phi^*(\mathcal{DB})) = \mathcal{P}(\Phi^*[n](\Phi^*[n-1] \cdots (\Phi^*[1](\mathcal{DB})) \cdots)). \]

The answer to the query $Q$ is given by $M[q]$. □

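Example 5 Consider the database $\mathcal{DB} = \{\text{fish}, \text{beef}, \text{red-wine}, \text{white-wine}, \text{pie}\}$ and the preference rules $\Phi$ of Example 2. Disregarding useless rules, $\Phi^*$ is equal to $\Phi$ and it is layered into $\Phi^* = \langle \Phi^*[1], \Phi^*[2] \rangle = \langle \{\rho_1\}, \{\rho_2, \rho_3\} \rangle$. The application of $\Phi^*[1]$ to $\mathcal{DB}$ gives the set $M_1 = \Phi^*[1](\mathcal{DB}) = \{\text{fish}, \text{red-wine}, \text{white-wine}, \text{pie}\}$. The application of $\Phi^*[2]$ to $M_1$ gives the set $M_2 = \Phi^*[2](M_1) = \{\text{fish}, \text{white-wine}, \text{pie}\}$. □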

3.2.2 General Preferences

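We consider now general prioritized queries $Q = \langle q, \mathcal{P}, \Phi \rangle$ where $\mathcal{P}$ is a stratified Datalog program and $\Phi$ expresses preferences also on derived atoms.

Let $\langle q, \mathcal{P}, \Phi \rangle$ be a prioritized query and $\mathcal{DB}$ a database. Let $\langle \mathcal{P}_1, \ldots, \mathcal{P}_k \rangle$ be the standard stratification of $ground(\mathcal{P})$ and let $\mathcal{P}_0 = \{A \leftarrow \mid A \in \mathcal{DB}\}$. Then, $\Phi^*[\mathcal{P}_i]$, for $i \in [0..k]$, denotes the following set of preference rules in $\Phi^*$:

\[ \Phi^*[\mathcal{P}_i] = \{ A \succ C \leftarrow body \mid \exists (C \leftarrow body') \in \mathcal{P}_i \} \]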

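Definition 8 Let $\mathcal{DB}$ be a database and let $Q = \langle q, \mathcal{P}, \Phi \rangle$ be a prioritized query and $\langle \mathcal{P}_1, \ldots, \mathcal{P}_k \rangle$ the standard stratification of $\mathcal{P}$. The application of $\mathcal{P}$ and $\Phi$ to $\mathcal{DB}$ is defined as follows: $M_0 = \Phi^*[\mathcal{P}_0](\mathcal{DB})$ and, for each $i$ in $[1..k]$, $M_i = \Phi^*[\mathcal{P}_i](\mathcal{P}_i(M_{i-1}))$. The answer to the query $Q$ over the database $\mathcal{DB}$, denoted as $Q(\mathcal{DB})$, is given by $M_k[q]$. □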
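The stratum-by-stratum evaluation can be sketched in Python and checked against Example 1. The sketch makes several assumptions: the program is already ground and split into strata, the preference rules attached to each stratum are already grouped into layers and passed in as input, negated body atoms belong to lower strata, and built-in atoms are omitted. Program rules are pairs `(head, body)`, preference rules are triples `(A, C, body)`, and a literal starting with `"not "` is negated; all function names are illustrative.

```python
def holds(body, atoms):
    """Truth of a conjunction of ground literals in the set atoms."""
    return all((lit[4:] not in atoms) if lit.startswith("not ") else (lit in atoms)
               for lit in body)

def apply_layer(layer, atoms):
    """Remove the atoms dominated by another atom according to one layer."""
    def preferable(t, u):
        fwd = any(a == t and c == u and holds(b, atoms) for a, c, b in layer)
        bwd = any(a == u and c == t and holds(b, atoms) for a, c, b in layer)
        return fwd and not bwd
    return {t for t in atoms if not any(preferable(u, t) for u in atoms)}

def eval_stratum(rules, atoms):
    """Least fixpoint of one ground stratum on top of the atoms kept so far."""
    model = set(atoms)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and holds(body, model):
                model.add(head)
                changed = True
    return model

def evaluate(db, strata, pref_layers):
    """strata[i-1] holds P_i; pref_layers[i] holds the layers of Phi*[P_i], i = 0..k."""
    m = set(db)
    for layer in pref_layers[0]:          # M_0: preferences on base atoms
        m = apply_layer(layer, m)
    for rules, stratum_layers in zip(strata, pref_layers[1:]):
        m = eval_stratum(rules, m)        # P_i(M_{i-1})
        for layer in stratum_layers:      # then Phi*[P_i], one layer at a time
            m = apply_layer(layer, m)
    return m

db = {"fish", "beef"}
strata = [[("red-wine", ("beef",)), ("white-wine", ("fish",))]]
rho1 = ("red-wine", "white-wine", ("beef",))
rho2 = ("fish", "beef", ())
print(sorted(evaluate(db, strata, [[], [[rho1]]])))
print(sorted(evaluate(db, strata, [[[rho2]], [[rho1]]])))
```

With only the wine preference the result keeps fish, beef and red-wine; adding the preference of fish over beef removes beef before the program is evaluated, so red-wine is never derived and the result is fish and white-wine, as in Example 1.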

Our proposal is sound, i.e. for each ground preference rule $A \succ C \leftarrow body$ in $\Phi^*$, if $M_k \models (body \wedge A)$ then $M_k \not\models C$. Moreover, it can be shown that $Q(\mathcal{DB})$ can be computed in polynomial time.

4 CONCLUSIONS

This paper has introduced prioritized queries, a form of queries well-suited for expressing preferences among tuples either belonging to the source database or derived by means of the program specified in the query. It has been shown that prioritized queries are well-suited to express queries wherein we are interested only in preferred tuples. A stratified semantics for computing prioritized queries has been presented where the program $\mathcal{P}$ is partitioned into strata and the preference rules associated with each stratum of $\mathcal{P}$ are divided into layers; a query is evaluated by computing one stratum at a time and by applying the preference rules one layer at a time. The computational complexity of computing prioritized queries remains polynomial.

REFERENCES

Agrawal, R., and Wimmers, E. L. (2002). A framework

for expressing and combining preferences. Proc. SIG-

MOD, pp. 297-306.

Borzsonyi S., Kossmann D., Stocker K. (2001). The skyline

operator, Proc. ICDE, 421-430.

Brewka, G. (2004). Complex Preferences for Answer Set

Optimization, KR, 213-223.

Brewka G., Niemela I., Truszczynski M. (2003). Answer

Set Optimization. IJCAI, 867-872.

Chomicki, J. (2003). Preference Formulas in Relational

Queries. ACM TODS, 28(4), 1-40.

Chomicki, J., Godfrey, P., Gryz, J., and Liang, D. (2003).

Skyline with presorting. Proc. ICDE.

Delgrande, J. P., Schaub, T., Tompits, H. (2003). A Framework for Compiling Preferences in Logic Programs. TPLP, 3(2), 129-187.

Gelfond, M., Son, T.C. (1997). Reasoning with prioritized

defaults. LPKR, 164-223.


Gelfond, M., Lifschitz, V. (1988). The Stable Model Se-

mantics for Logic Programming, ICLP.

Greco S., Zaniolo C. (2002). Greedy by Choice, Proc.

PODS.

Kießling, W. (2002). Foundations of preferences in database systems, Proc. VLDB.

Kießling, W., Kostler, G. (2002). Preference SQL - Design, Implementation, Experience, VLDB.

Kossmann, D., Ramsak, F., and Rost, S. (2002). Shoot-

ing stars in the sky: An online algorithm for skyline

queries. Proc. VLDB.

Kostler, G., Kießling, W., Thone, H., Guntzer, U. (1995). Fixpoint iteration with subsumption in deductive databases. JIIS, 4, 123-148.

Lacroix M., Lavency P. (1987). Preferences: Putting More Knowledge Into Queries. VLDB, 217-225.

Papadias, D., Tao, Y., Fu, G., and Seeger, B. (2003). An

optimal and progressive algorithm for skyline queries,

Proc. SIGMOD, pp. 467-478.

Sakama, C., Inoue, K. (2000). Prioritized logic programming and its application to commonsense reasoning. Artificial Intelligence, 123, 185-222.

Torlone, R., Ciaccia, P. (2002). Finding the Best when it's a Matter of Preference, Proc. SEBD, pp. 347-360.

Ullman, J. D. (1988). Principles of Database and Knowledge-Base Systems, Vol. 1, Computer Science Press.

Zhang, Y., Foo, N. (1997). Answer sets for prioritized logic

programs. ILPS, 69-83.
