5.5 Clusters in relation to a selected
cluster
Using the same technique as in Section 5.4, we
introduce the following two views:
5.5.1 Union of all the clusters in relation to the
current cluster
Starting from the ten best rules of the current
cluster, we draw the parallel coordinates of each of
these rules on the other clusters. The user can then
see the most interesting zone, i.e. the one with the
highest values. The effect of the ten best rules on
the other clusters gives the user a general sample
of the entire cluster. With the union approach, many
best rules can be presented and compared at once.
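The union view can be sketched as follows. This is only a
minimal illustration, not the tool's actual code: the class
BestRuleUnion, its methods, and the assumption that each
cluster assigns a single numeric score to every rule are
hypothetical and introduced here for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: not part of the tool described in the paper.
final class BestRuleUnion {

    /** Returns the ids of the k highest-scoring rules, best first (ties broken by id). */
    static List<String> topK(Map<String, Double> scores, int k) {
        return scores.entrySet().stream()
                .sorted((a, b) -> {
                    int c = Double.compare(b.getValue(), a.getValue());
                    return c != 0 ? c : a.getKey().compareTo(b.getKey());
                })
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    /**
     * Union of the k best rules of every cluster.
     * Assumption: scoresByCluster maps a cluster name to (rule id -> score).
     */
    static Set<String> unionOfBest(Map<String, Map<String, Double>> scoresByCluster, int k) {
        Set<String> union = new LinkedHashSet<>();
        for (Map<String, Double> scores : scoresByCluster.values()) {
            union.addAll(topK(scores, k));
        }
        return union;
    }

    public static void main(String[] args) {
        // Made-up scores for two clusters, for illustration only.
        Map<String, Double> c1 = new HashMap<>();
        c1.put("r1", 0.9); c1.put("r2", 0.8); c1.put("r3", 0.1);
        Map<String, Double> c2 = new HashMap<>();
        c2.put("r3", 0.7); c2.put("r1", 0.6); c2.put("r4", 0.2);

        Map<String, Map<String, Double>> clusters = new LinkedHashMap<>();
        clusters.put("C1", c1);
        clusters.put("C2", c2);

        System.out.println(new ArrayList<>(unionOfBest(clusters, 2)));
    }
}
```

With k = 10 this yields the pool of rules whose parallel
coordinates would be drawn; the pool grows with the number
of clusters, which is why the union presents many rules.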
5.5.2 Intersection of all the clusters in relation to
the current cluster
By decreasing the number of best rules in one
cluster, we can observe the rank distribution. The
intersection is less interesting than the union
because it generally does not contain any interesting
zone. The intersection in relation to the current
cluster is nevertheless important when the user finds
only a small set of interesting and closely related
rules.
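The intersection view, and the way it shrinks as fewer best
rules are kept, can be sketched as follows. Again this is
only an illustration under stated assumptions: the class
BestRuleIntersection and the per-cluster ranked lists of
rule ids are hypothetical, and the sample data is made up.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of the tool described in the paper.
final class BestRuleIntersection {

    /**
     * Rules that appear among the first k entries of every ranked list.
     * Assumption: each list holds one cluster's rule ids, best rule first.
     */
    static Set<String> intersectionOfBest(Collection<List<String>> rankedByCluster, int k) {
        Set<String> common = null;
        for (List<String> ranked : rankedByCluster) {
            int end = Math.min(k, ranked.size());
            Set<String> best = new LinkedHashSet<>(ranked.subList(0, end));
            if (common == null) {
                common = best;           // first cluster seeds the result
            } else {
                common.retainAll(best);  // keep only rules shared so far
            }
        }
        return common == null ? new LinkedHashSet<>() : common;
    }

    public static void main(String[] args) {
        // Made-up rankings for three clusters, for illustration only.
        List<List<String>> ranked = Arrays.asList(
                Arrays.asList("r1", "r2", "r3", "r4"),
                Arrays.asList("r2", "r1", "r5", "r3"),
                Arrays.asList("r2", "r6", "r1", "r3"));

        // Decrease the number of best rules and watch the common set shrink.
        for (int k = 4; k >= 1; k--) {
            System.out.println("k=" + k + " -> " + intersectionOfBest(ranked, k));
        }
    }
}
```

On this sample the common set goes from three rules at
k = 4 to none at k = 1, which illustrates why the
intersection often contains no interesting zone.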
6 CONCLUSION
This work has led to the implementation of a tool
comprising more than 8000 lines of Java code for the
analysis of data set characteristics, quality measure
sensitivity, correlations, clustering and ranking.
We have identified eleven clusters among all the
quality measures studied on the Mushroom data set.
We have also proposed a way to study the ten best
rules of each cluster, as well as the union and the
intersection of the ten best rules of all the clusters
in relation to the current cluster. The union of the
ten best rules of all the clusters is also presented
for the user's choice. For this first presentation of
our results, we used only thirty-four measures in the
implementation.
Our future research will focus on improving the
clustering of measures by (1) designing a better
similarity measure than linear correlation, and
(2) selecting the best representative measure in
each cluster.
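For reference, the linear correlation between two measures
can be sketched as below. We assume here that it is the
usual Pearson coefficient computed over the values that two
measures take on the same rules; the class
LinearCorrelation and the sample values are hypothetical.

```java
// Hypothetical helper: not part of the tool described in the paper.
final class LinearCorrelation {

    /**
     * Pearson correlation of two equally long value vectors.
     * Returns NaN when one vector is constant (correlation undefined).
     */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two vectors of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        if (sxx == 0.0 || syy == 0.0) {
            return Double.NaN;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Made-up values of two measures on the same four rules.
        double[] m1 = {0.2, 0.5, 0.7, 0.9};
        double[] m2 = {0.1, 0.4, 0.8, 0.95};
        System.out.println("r = " + pearson(m1, m2));
    }
}
```

Such a coefficient only captures linear dependence between
two measures, which is one reason a better similarity
measure is worth investigating.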
ICEIS 2005 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS