Observing it, one can see that the number of rules ob-
tained with support 0.30 and confidence 0.70 for the
original dataset (without negated attributes) is 4:
[c: 0.74 s: 0.45] forum ⇒ organizer
[c: 0.97 s: 0.34] content-page ⇒ organizer
[c: 0.89 s: 0.32] assignment ⇒ organizer
[c: 0.81 s: 0.81] ⇒ organizer
The rules obtained are not very informative, basi-
cally saying that whatever they do, students also visit
the organizer page. But this does not come as a sur-
prise, given the fact that “organizer” is the main front
page of the course.
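To make the bracketed figures concrete, the following minimal Python sketch computes support (s) and confidence (c), assuming the standard definitions: the support of a rule is the fraction of transactions that contain both antecedent and consequent, and its confidence is that support divided by the support of the antecedent. The toy sessions and the helper names `support` and `confidence` are hypothetical illustrations; they are neither the course data nor the implementation used in this paper.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Support of antecedent and consequent together, over support of the antecedent."""
    ante = frozenset(antecedent)
    s_ante = support(transactions, ante)
    if s_ante == 0:
        return 0.0
    return support(transactions, ante | frozenset(consequent)) / s_ante


# Hypothetical toy sessions (assumption: one set of visited resources per session).
sessions = [
    frozenset({"organizer", "forum"}),
    frozenset({"organizer", "content-page"}),
    frozenset({"forum"}),
    frozenset({"organizer"}),
    frozenset({"organizer", "forum", "assignment"}),
]

# Rule with empty antecedent: its confidence equals the support of the consequent.
s_org = support(sessions, {"organizer"})
c_org = confidence(sessions, set(), {"organizer"})

# Rule forum => organizer.
s_rule = support(sessions, {"forum", "organizer"})
c_rule = confidence(sessions, {"forum"}, {"organizer"})
```

With an empty antecedent, confidence and support coincide, which is why the rule "⇒ organizer" above carries the same value for c and s.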
Clearly, these rules cannot reveal which resources are
used less, unlike the rules obtained from the
neg-expanded dataset. Here are some of the 34 rules
mined at support 0.3, confidence 0.7 and confidence
boost 1.05 with negated attributes:
[c: 0.70 s: 0.70] ⇒ announcement=0 file-manager=0 calendar=0 learning-objectives=0 student-bookmark=0 web-link=0 who-is-online=0 chat=0
[c: 0.86 s: 0.30] content-page=1 ⇒ announcement=0 chat=0 organizer=1 student-bookmark=0
[c: 0.84 s: 0.30] content-page=1 ⇒ chat=0 student-bookmark=0 mail=0
[c: 0.86 s: 0.30] content-page=1 ⇒ my-grades=0 organizer=1 student-bookmark=0
[c: 0.88 s: 0.31] content-page=1 ⇒ who-is-online=0 organizer=1 student-bookmark=0
[c: 0.86 s: 0.30] content-page=1 ⇒ organizer=1 web-link=0 student-bookmark=0
[c: 0.85 s: 0.30] content-page=1 ⇒ calendar=0 chat=0 organizer=1 student-bookmark=0
[c: 0.85 s: 0.30] content-page=1 ⇒ chat=0 organizer=1 student-bookmark=0 file-manager=0
[c: 0.85 s: 0.30] content-page=1 ⇒ chat=0 organizer=1 who-is-online=0
The first rule shows that many resources are scarcely
used, such as the chat or the announcement page.
Therefore, if the instructor has something important
to communicate to the students, the best option would
be to put it in the forum, a resource that the positive
rules show to be accessed more often. Furthermore, one
may note that when the students connect to the platform
in order to study (i.e., when they visit content-page
resources), they do not visit the chat, their bookmarks
or the mail.
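The neg-expansion that makes such rules minable can be sketched as follows: each session is rewritten so that every attribute appears explicitly, as attribute=1 when the resource was visited and attribute=0 otherwise. This is a minimal Python sketch under that assumption; the function name `neg_expand`, the helper `support` and the toy sessions are hypothetical and do not reproduce the actual preprocessing.

```python
from typing import FrozenSet, Iterable, List, Set


def neg_expand(sessions: Iterable[Set[str]],
               attributes: Iterable[str]) -> List[FrozenSet[str]]:
    """Rewrite each session with an explicit 'a=1' or 'a=0' item per attribute."""
    attrs = list(attributes)
    return [
        frozenset(f"{a}={1 if a in s else 0}" for a in attrs)
        for s in sessions
    ]


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


# Hypothetical attributes and sessions, for illustration only.
attributes = ["organizer", "content-page", "chat"]
sessions = [
    {"organizer", "content-page"},
    {"organizer"},
    {"organizer", "chat"},
    {"content-page"},
]

expanded = neg_expand(sessions, attributes)

# After expansion, the absence of a resource is an ordinary item,
# so its support can be counted like that of any other item.
chat_unused = support(expanded, {"chat=0"})
study_without_chat = support(expanded, {"content-page=1", "chat=0"})
```

Since every expanded transaction has exactly one item per attribute, the number of candidate itemsets grows considerably, which is one reason the mined rule set becomes large without further filtering.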
A further, similar analysis of this dataset in terms of
association rules with negations and high confidence
boost, including an additional pruning heuristic and a
comparison with a different dataset of similar origin
but quite different characteristics, is described in
(Balcázar et al., 2010).
5 CONCLUSIONS AND FUTURE
WORK
In many practical applications, the output of a data
mining process could greatly benefit from adding to
the dataset the “negated” versions of the attributes.
One problem that arises, though, is that the
resulting set of mined rules is huge, making human
interpretation infeasible. In this paper we propose
to use a recently introduced notion called confidence
boost that is able to filter out those rules that are not
“novel”, by quantifying to what extent the informa-
tion in each association rule “looks different” from
that of the rest of the rules. Our implementation em-
ploys the open-source closure miner from (Borgelt,
2003), and is available at slatt.googlecode.com.
As future work, we would like to look into
(mathematical and practical) ways of pushing the
confidence boost constraint to an earlier stage of the
algorithm, thus avoiding the vast amount of time spent
computing closed sets that will not be used, or
generating thousands of rules that are later discarded
because of their low confidence boost.
REFERENCES
Balcázar, J. L. (2010). Formal and computational
properties of the confidence boost in association
rules. Available at:
[http://personales.unican.es/balcazarjl].
Balcázar, J. L., Tîrnăucă, C., and Zorrilla, M. (2010).
Mining educational data for patterns with negations
and high confidence boost. Accepted for TAMIDA’2010;
available at: [http://personales.unican.es/balcazarjl].
Borgelt, C. (2003). Efficient implementations of apriori
and eclat. In Goethals, B. and Zaki, M. J., editors,
FIMI, volume 90 of CEUR Workshop Proceedings.
CEUR-WS.org.
Boulicaut, J.-F., Bykowski, A., and Jeudy, B. (2000).
Towards the tractable discovery of association rules
with negations. In FQAS, pages 425–434.
Clementine (2005). Clementine 10.0 desktop user guide.
Kryszkiewicz, M. (2005). Generalized disjunction-free
representation of frequent patterns with negation.
J. Exp. Theor. Artif. Intell., 17(1-2):63–82.
Kryszkiewicz, M. (2009). Non-derivable item set and
non-derivable literal set representations of patterns
admitting negation. In Pedersen, T. B., Mohania, M. K.,
and Tjoa, A. M., editors, DaWaK, volume 5691 of LNCS,
pages 138–150. Springer.
KDIR 2010 - International Conference on Knowledge Discovery and Information Retrieval