measures for mining generalized association rules.
However, most works focus on improving methods for
obtaining generalized fuzzy association rules, which
are rules composed of linguistic terms. Few works
have sought to improve the exploration of
generalized rules under fuzzy concept hierarchies,
mainly with respect to the stage at which these
hierarchies are used.
Besides, some works, such as (Miani et al., 2009)
and (Escovar et al., 2006), explore semantic
enrichment through similarity relations. However,
these works do not consider that the degree of a
similarity relation between two or more elements
also depends on the point of view, or context,
under analysis. For example, consider the problem of
comparing two vegetables, tomato and khaki, from
two different points of view (contexts): appearance
and flavour. In the appearance context, tomato is
very similar to khaki, with a very high degree of
similarity; in the flavour context, the two are only
slightly similar, with a low degree of similarity.
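The tomato and khaki example can be made concrete with a small sketch. The Python snippet below stores one similarity degree per pair of items and per context. The numeric degrees, the `SIMILARITY` table and the `similarity` function are illustrative assumptions, not values or names taken from the paper.

```python
# Hypothetical similarity degrees; the paper gives no numeric values.
SIMILARITY = {
    ("appearance", frozenset({"tomato", "khaki"})): 0.9,
    ("flavour", frozenset({"tomato", "khaki"})): 0.2,
}


def similarity(a, b, context):
    """Return the similarity degree of items a and b under a context."""
    if a == b:
        return 1.0  # an item is fully similar to itself
    # The pair is stored as a frozenset so that argument order is irrelevant.
    return SIMILARITY.get((context, frozenset({a, b})), 0.0)


high = similarity("tomato", "khaki", "appearance")
low = similarity("khaki", "tomato", "flavour")
```

The same pair of items yields a different degree in each context, which a single context-free similarity relation cannot express.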
Thus, this paper presents the Context FOntGAR
algorithm for mining generalized association rules
using fuzzy ontologies composed of
specialization/generalization relationships whose
degrees vary in the interval [0,1], and of
similarity relations whose degrees depend on the
context. Generalization can occur at all levels of
the fuzzy ontologies. The paper is organized as
follows: Section 2 presents related work, Section 3
presents the Context FOntGAR algorithm, Section 4
presents the experiments, and Section 5 presents
the conclusions.
2 BACKGROUND
Generalized association rules, which are rules
composed of items from any level of a given
taxonomy, were introduced by (Srikant and Agrawal,
1995) with the aim of obtaining more general
knowledge. Many works use crisp taxonomic
structures. These works differ mainly in the stage
of the algorithm's processing at which the
structures are used.
In pre-processing, the generalized rules are
obtained from extended databases, which are
generated before pattern generation. An extended
database is composed of transactions that contain
the items of the original database together with
their ancestors in the taxonomy. In post-processing,
the generalized rules are obtained after the
traditional rules have been generated, through a
sub-algorithm that applies some generalization
methodology to the generated patterns.
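The extended database used in the pre-processing approach can be sketched as follows. This is a minimal illustration, assuming a crisp taxonomy stored as a child-to-parent map; the taxonomy, the items and the names `PARENT`, `ancestors` and `extend_database` are hypothetical and do not come from the cited works.

```python
# Hypothetical crisp taxonomy: each item maps to its parent (roots are absent).
PARENT = {
    "apple": "fruit",
    "khaki": "fruit",
    "fruit": "food",
    "bread": "food",
}


def ancestors(item, parent):
    """Collect every ancestor of an item by walking up the taxonomy."""
    found = []
    while item in parent:
        item = parent[item]
        found.append(item)
    return found


def extend_database(transactions, parent):
    """Add to each transaction the ancestors of the items it contains."""
    extended = []
    for transaction in transactions:
        items = set(transaction)
        for item in transaction:
            items.update(ancestors(item, parent))
        extended.append(items)
    return extended


transactions = [{"apple", "bread"}, {"khaki"}]
extended = extend_database(transactions, PARENT)
```

Any frequent-itemset algorithm run on `extended` can then produce rules that mix leaf items and ancestors, at the cost of larger transactions and more candidates.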
In (Wu and Huang, 2011), the mining is performed
using an efficient data structure. The goal is to use
the structure to find rules between items at different
levels of a taxonomy tree, under the assumption that
the original frequent itemsets and association rules
were generated in advance. Thus, the generalization
occurs during the post-processing step. Also in
post-processing, (Carvalho et al., 2007) proposed
the GARPA algorithm. Unlike the approach proposed
by (Srikant and Agrawal, 1995), this algorithm does
not insert ancestor items into the database
transactions. Instead, the generalization is done by
replacing rule items with their taxonomy ancestors.
From a quantitative point of view, this process is
more advantageous than the one proposed by (Srikant
and Agrawal, 1995), because it implies a smaller
number of candidates, and consequently of generated
rules, which dispenses with the use of measures for
pruning redundant rules.
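The idea of generalizing in post-processing by replacing rule items with taxonomy ancestors can be illustrated with a short sketch. This is not the GARPA algorithm itself: it is a simplified, one-level replacement under assumed data, and it ignores the recalculation of support and confidence. The taxonomy, the rules and the function names are hypothetical.

```python
# Hypothetical one-level taxonomy: item -> parent.
PARENT = {"apple": "fruit", "khaki": "fruit", "bread": "bakery"}


def generalize_rule(rule, parent):
    """Replace each item of a rule by its parent, when it has one."""
    antecedent, consequent = rule

    def lift(items):
        return frozenset(parent.get(item, item) for item in items)

    return lift(antecedent), lift(consequent)


def generalize_rules(rules, parent):
    """Generalize all rules one level; merge duplicates, drop degenerate rules."""
    result = set()
    for rule in rules:
        antecedent, consequent = generalize_rule(rule, parent)
        if antecedent & consequent:
            continue  # the same item on both sides carries no information
        result.add((antecedent, consequent))
    return result


rules = [
    (frozenset({"apple"}), frozenset({"bread"})),
    (frozenset({"khaki"}), frozenset({"bread"})),
]
generalized = generalize_rules(rules, PARENT)
```

In this toy case the two specific rules collapse into a single rule, fruit → bakery, which suggests why working on rules rather than on extended transactions can yield fewer candidates.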
In mining generalized rules, most works that use
fuzzy logic focus on obtaining generalized fuzzy
association rules, which are rules composed of fuzzy
linguistic terms such as young and tall. Such
approaches use crisp taxonomies, and the linguistic
terms are generated from fuzzy intervals, normally
obtained through clustering. Besides, these works
explore quantitative or categorical attributes.
Examples include (Hung-Pin et al., 2006),
(Mahmoudi et al., 2011), (Cai et al., 1998), (Hong et
al., 2003) and (Lee et al., 2008). On the other hand,
few works use fuzzy taxonomies to obtain their
rules. In this case, the focus is not on exploring
patterns composed of linguistic terms, but on how to
explore taxonomic structures with different
specialization/generalization degrees.
The problem of mining generalized rules using
fuzzy taxonomies was proposed by (Wei and Chen,
1999). They introduced the possibility of partial
relationships in taxonomies: while in crisp
taxonomies the specialization/generalization degrees
are always 1, in fuzzy structures such degrees vary in the
interval [0,1]. So, the degree $\mu_{xy}$ with
which any node y belongs to its ancestor x can be
derived based upon the notions of subclass,
superclass and inheritance, and may be calculated
using the max-min product combination. Specifically,

$$\mu_{xy} = \max_{\forall l:\, x \to y} \left( \min_{\forall e \text{ on } l} \mu_{le} \right) \qquad (1)$$

where $l: x \to y$ is one of the paths of attributes
x and y, e on l is one of the edges on access l,
$\mu_{le}$ is