ing patterns, contrast sets and frequent pattern-based
classification (Dong and Li, 1999; Zhang et al., 2000;
Bay and Pazzani, 2001; De Raedt and Kramer, 2001;
Cheng et al., 2008, to cite a few), our task differs
considerably from those mentioned. First, on closer
inspection, the knowledge mined by the techniques
presented below is actually different. Indeed, emerging
patterns, contrast sets and discriminative patterns can
be well represented in the form of rules, but the only
attribute allowed to occur in their heads is the class
attribute, whereas we search for generic rules with any
attribute in their head, while the class attribute is not
considered at
all. Moreover, the interestingness measure charac-
terizing patterns searched for in the cited literature
is based on measuring the frequency gap for the pat-
tern in the two classes, while we use the confidence
gap. While the former measures are (anti-)monotonic
with respect to pattern generality, the latter one is non-
monotonic and, hence, much more challenging to deal
with. Also, these patterns tend to capture knowledge
characterizing the data in a global sense, since they
are based on the notion of absolute frequency. Con-
versely, the knowledge mined by means of discrim-
inating rules characterizes the data in a local sense.
Indeed, the confidence is related to the frequency of
the condition in the head of a rule in the subpopula-
tion of the data selected by its body. Finally, we define
an innovative preference relation based on a statistical
significance test, while most pattern discovery meth-
ods prefer patterns on the basis of generality and/or
measure maximization.
As already noted, the technique presented here
can be regarded as an extension to groups of anoma-
lies of the technique presented in (Angiulli et al.,
2009). Indeed, since confidence is insensitive to absolute
frequency, it is more suitable than support for
characterizing unbalanced subpopulations, as usually occurs
when a group of anomalous individuals is compared to a
whole normal population. The major differences between
this work and (Angiulli et al., 2009) are as follows. In
this work two subpopulations are compared, while in
(Angiulli et al., 2009) only a single (outlier) object can
be compared with the overall (normal) population; moreover,
the discriminating measure adopted there is very different
from the one developed here, since it is designed for a
single object, and it is not at all clear how to generalize
it, if that is even possible, to deal with more than a very
limited number of anomalous individuals.
The rest of the work is organized as follows. Section 2
presents preliminary definitions. Section 3 defines
discriminating rules. Section 4 introduces the notion of
outstanding discriminating rules. Section 5 describes the
DRUID algorithm for mining outstanding rules. Section 6
presents experimental results. Finally, Section 7
concludes the work.
2 PRELIMINARIES
In this section some preliminary notions are pre-
sented.
Let $A = \{a_1, \ldots, a_m\}$ be a set of attributes and $T$
a database on $A$ (a multi-set of tuples on $A$). A simple
condition $c$ on $A$ is an expression of the form $a = v$,
where $a \in A$ and $v$ belongs to the domain of $a$. A
condition $C$ on $A$ is a conjunction
$c_1 \wedge \ldots \wedge c_k$ of $k$ ($k \geq 0$) simple
conditions on $A$. A condition with $k = 0$ is called an
empty condition. In the following, for a condition $C$ of
the form $c_1 \wedge \ldots \wedge c_k$, $\mathrm{cond}(C)$
denotes the set of simple conditions $\{c_1, \ldots, c_k\}$,
while $\mathrm{attr}(C)$ denotes the set
$\{a_i \mid (a_i = v_i) \in C\}$, that is, the subset of
attributes of $A$ appearing in the simple conditions $c_i$
of $C$.
Let $T$ be a database on a set of attributes $A$, let $t$
be a tuple of $T$. Let $c \equiv a = v$ be a simple condition
on $A$. The tuple $t$ satisfies $c$ iff $t[a] = v$, where
$t[a]$ denotes the value the tuple $t$ assumes on $a$. Let
$C$ be a condition on $A$. The tuple $t$ satisfies $C$ iff
$t$ satisfies each simple condition $c_i$ of $C$. If $C$ is
an empty condition then each tuple $t$ satisfies $C$. $T_C$
denotes the database including the tuples of $T$ which
satisfy $C$.
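The satisfaction relation and the selected database defined above can be illustrated with a minimal Python sketch. The representation is an assumption made only for illustration and does not come from the paper: a tuple is a dictionary from attribute names to values, a simple condition is an (attribute, value) pair, and a condition is a frozenset of such pairs. The names `attr`, `satisfies` and `select` are hypothetical.

```python
from typing import Dict, FrozenSet, Hashable, List, Set, Tuple

# Hypothetical representation (an assumption, not the paper's):
# a tuple of the database is a dict attribute -> value,
# a simple condition a = v is the pair (a, v),
# a condition is a frozenset of simple conditions, i.e. cond(C).
Row = Dict[str, Hashable]
SimpleCondition = Tuple[str, Hashable]
Condition = FrozenSet[SimpleCondition]


def attr(C: Condition) -> Set[str]:
    """Attributes appearing in the simple conditions of C."""
    return {a for (a, _) in C}


def satisfies(t: Row, C: Condition) -> bool:
    """t satisfies C iff t[a] == v for every simple condition a = v of C."""
    # all() over an empty condition is True, so every tuple
    # satisfies the empty condition.
    return all(t[a] == v for (a, v) in C)


def select(T: List[Row], C: Condition) -> List[Row]:
    """The database T_C: the tuples of T satisfying C (duplicates kept)."""
    return [t for t in T if satisfies(t, C)]


T = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "L"},
    {"color": "blue", "size": "L"},
]
C = frozenset({("color", "red"), ("size", "L")})
empty: Condition = frozenset()

print(select(T, C))           # only the red, large tuple is selected
print(len(select(T, empty)))  # the empty condition selects all of T
```

A list is used for the database so that repeated tuples are preserved, in line with the multi-set definition.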
Let $A = \{a_1, \ldots, a_m\}$ be a set of attributes. A rule
on $A$ is an expression of the form $B \Rightarrow h$, where
$B$ is a condition on $A$ and $h$ is a simple condition on
$A$. $B$ and $h$ are called the body and the head of the
rule, respectively. The size of the rule
$R \equiv B \Rightarrow h$, denoted by $|R|$, is the
cardinality of the set $\mathrm{cond}(B)$. Let $T$ be a
database on a set of attributes $A$, let $t$ be a tuple of
$T$, and let $R \equiv B \Rightarrow h$ be a rule on $A$. The
tuple $t$ satisfies $R$ iff $t$ satisfies $B \wedge h$. Let
$R \equiv B \Rightarrow h$ and $R' \equiv B' \Rightarrow h'$
be two rules such that $h = h'$ and
$\mathrm{cond}(B) \supset \mathrm{cond}(B')$. Then $R$ is
said to be a superrule of $R'$ and $R'$ is said to be a
subrule of $R$.
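As an illustration of the rule definitions above, the following Python sketch models a rule as a body (a set of simple conditions) plus a head (one simple condition). The type `Rule` and the helpers `size`, `satisfies_rule` and `is_superrule` are hypothetical names chosen for this example, and the dictionary representation of tuples is likewise an assumption.

```python
from typing import Dict, FrozenSet, Hashable, NamedTuple, Tuple

# Hypothetical representation (an assumption, not the paper's).
Row = Dict[str, Hashable]
SimpleCondition = Tuple[str, Hashable]


class Rule(NamedTuple):
    body: FrozenSet[SimpleCondition]  # cond(B)
    head: SimpleCondition             # h


def size(rule: Rule) -> int:
    """|R|: the number of simple conditions in the body."""
    return len(rule.body)


def satisfies_rule(t: Row, rule: Rule) -> bool:
    """t satisfies R iff t satisfies both the body and the head."""
    head_attr, head_value = rule.head
    body_ok = all(t[a] == v for (a, v) in rule.body)
    return body_ok and t[head_attr] == head_value


def is_superrule(r: Rule, r_prime: Rule) -> bool:
    """True iff r is a superrule of r_prime: same head, and the body
    of r strictly contains the body of r_prime."""
    # On frozensets, '>' tests for a proper superset.
    return r.head == r_prime.head and r.body > r_prime.body


general = Rule(frozenset({("color", "red")}), ("price", "high"))
specific = Rule(frozenset({("color", "red"), ("size", "L")}), ("price", "high"))

print(size(specific))                   # 2
print(is_superrule(specific, general))  # True
print(is_superrule(general, specific))  # False
```

Because the containment is strict, a rule is neither a superrule nor a subrule of itself under this reading.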
Let $T$ be a database on a set of attributes $A$, and let
$C$ be a condition on $A$. The support of $C$ in $T$, denoted
by $\mathrm{sup}_T(C)$, is the ratio $|T_C| / |T|$ of the
number of tuples of $T$ satisfying $C$ over the size of $T$.
Given a database $T$ on $A$ and a threshold $\sigma$,
$0 \leq \sigma \leq 1$, a condition $C$ is said to be
$\sigma$-supported by $T$ iff $\mathrm{sup}_T(C) \geq \sigma$.
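The support of a condition and the σ-supported test can be sketched in a few lines of Python. The function names `support` and `is_supported`, the dictionary representation of tuples, and the choice to raise an error on an empty database are all assumptions of this example. Exact fractions are used so that the ratio is not affected by floating-point rounding.

```python
from fractions import Fraction
from typing import Dict, FrozenSet, Hashable, List, Tuple

# Hypothetical representation (an assumption, not the paper's).
Row = Dict[str, Hashable]
Condition = FrozenSet[Tuple[str, Hashable]]


def support(T: List[Row], C: Condition) -> Fraction:
    """Support of C in T: |T_C| / |T|."""
    if not T:
        # The ratio is undefined on an empty database; raising is
        # a choice made for this sketch.
        raise ValueError("support is undefined for an empty database")
    matching = sum(1 for t in T if all(t[a] == v for (a, v) in C))
    return Fraction(matching, len(T))


def is_supported(T: List[Row], C: Condition, sigma: float) -> bool:
    """True iff C is sigma-supported by T."""
    return support(T, C) >= sigma


T = [
    {"color": "red", "size": "L"},
    {"color": "red", "size": "S"},
    {"color": "red", "size": "L"},
    {"color": "blue", "size": "L"},
]
red = frozenset({("color", "red")})
red_large = frozenset({("color", "red"), ("size", "L")})

print(support(T, red))                 # 3/4
print(support(T, red_large))           # 1/2
print(is_supported(T, red_large, 0.5)) # True
```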
Let $T$ be a database on a set of attributes $A$, and
let $R$ be a rule $B \Rightarrow h$ on $A$. The confidence of
$R$ in $T$, denoted by $\mathrm{cnf}_T(R)$, is the ratio
$|T_{B \wedge h}| / |T_B|$ of the number of tuples of $T$
satisfying $R$ over the number of tuples satisfying $B$.
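A small Python sketch may help to make the confidence definition concrete. The function `confidence`, the pair-based encoding of rules, and the decision to return `None` when no tuple satisfies the body (where the ratio is undefined) are assumptions made for this example only.

```python
from fractions import Fraction
from typing import Dict, FrozenSet, Hashable, List, Optional, Tuple

# Hypothetical representation (an assumption, not the paper's).
Row = Dict[str, Hashable]
SimpleCondition = Tuple[str, Hashable]
Condition = FrozenSet[SimpleCondition]


def confidence(T: List[Row], body: Condition,
               head: SimpleCondition) -> Optional[Fraction]:
    """Confidence of body => head in T: |T_{B and h}| / |T_B|."""
    # T_B: the subpopulation of T selected by the body.
    selected = [t for t in T if all(t[a] == v for (a, v) in body)]
    if not selected:
        return None  # undefined when no tuple satisfies the body
    head_attr, head_value = head
    # Tuples of T_B that also satisfy the head satisfy the whole rule.
    hits = sum(1 for t in selected if t[head_attr] == head_value)
    return Fraction(hits, len(selected))


T = [
    {"color": "red", "size": "L", "price": "high"},
    {"color": "red", "size": "L", "price": "low"},
    {"color": "red", "size": "S", "price": "high"},
    {"color": "blue", "size": "L", "price": "high"},
]
head = ("price", "high")

print(confidence(T, frozenset({("color", "red")}), head))                 # 2/3
print(confidence(T, frozenset({("color", "red"), ("size", "L")}), head))  # 1/2
print(confidence(T, frozenset(), head))                                   # 3/4
print(confidence(T, frozenset({("color", "green")}), head))               # None
```

The denominator is the size of the subpopulation selected by the body rather than the size of the whole database, so the value does not depend on how frequent the body itself is.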
ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence