Rules                                                                                    Implying class
0.33 < PL < 0.678, 0.375 < PW < 0.792                                                   Iris-versicolor
0.208 < SW < 0.542, 0.627 < PL < 0.847, 0.54 < PW < 1.0                                 Iris-virginica
0.778 < SL < 1.0, 0.25 < SW < 0.75, 0.814 < PL < 1.0, 0.625 < PW < 0.917                Iris-virginica
0.0 < SL < 0.417, 0.41 < SW < 0.917, 0.0 < PL < 0.153, 0.0 < PW < 0.208                 Iris-setosa
Figure 5: Optimized initial rules extracted by CSOM. Notation: SL – sepal_length, SW – sepal_width, PL – petal_length, PW – petal_width.
By applying the RO technique, the rule set was reduced to the four rules displayed in Figure 5. However, not as many attributes were removed from each of the rules, and two instances were misclassified. Hence, performing network pruning and retraining prior to RO may achieve a more optimal rule set. In cases where retraining the network would be too expensive, however, the RO technique can be applied by itself. Compared to the initial set of rules detected by CSOM, which consisted of nine rules with three misclassified instances, this is still a significant improvement.
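For illustration, the following minimal sketch shows how interval rules of the kind shown in Figure 5 can be applied to a normalized Iris instance (all attributes scaled to [0, 1]). The rule bounds are taken from Figure 5; the function and variable names are purely illustrative and are not part of the RO implementation.

# Sketch (assumed names): apply the optimized rules of Figure 5 to one instance.
RULES = [
    # (interval bounds per attribute, predicted class)
    ({"PL": (0.33, 0.678), "PW": (0.375, 0.792)}, "Iris-versicolor"),
    ({"SW": (0.208, 0.542), "PL": (0.627, 0.847), "PW": (0.54, 1.0)}, "Iris-virginica"),
    ({"SL": (0.778, 1.0), "SW": (0.25, 0.75), "PL": (0.814, 1.0), "PW": (0.625, 0.917)}, "Iris-virginica"),
    ({"SL": (0.0, 0.417), "SW": (0.41, 0.917), "PL": (0.0, 0.153), "PW": (0.0, 0.208)}, "Iris-setosa"),
]

def classify(instance):
    """Return the class of the first rule whose every interval covers the instance."""
    for bounds, label in RULES:
        if all(low < instance[attr] < high for attr, (low, high) in bounds.items()):
            return label
    return None  # instance not covered by any rule (counted as misclassified)

# Example: a normalized instance satisfying the first rule's intervals.
print(classify({"SL": 0.5, "SW": 0.4, "PL": 0.5, "PW": 0.5}))  # -> Iris-versicolor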
The second set of experiments was performed on the complex 'Sonar' dataset, which consists of sixty continuous attributes. The examples are classified into two groups, one identified as rocks (R) and the other as metal cylinders (M). The decision tree learned by the C4 algorithm (Quinlan, 1990) consisted of 18 rules with a predictive accuracy of 65.1%. These rules were taken as input to our RO technique, with the MT set to 0.2 and the MVT set to 0.0005. The optimized rule set consisted of only two rules, i.e. 0.0 < a11 <= 0.197 → R and 0.197 < a11 <= 1.0 → M. When tested on an unseen dataset, the predictive accuracy was 82.2%, i.e. 11 instances were misclassified out of the available 62. Hence, the RO process has again proved useful in simplifying the rule set without the cost of increasing the number of misclassified instances.
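The optimized Sonar rule set amounts to a single threshold on attribute a11. A minimal sketch of how this two-rule classifier and its predictive accuracy could be evaluated is given below; only the thresholds come from the text, while the function names and the (a11, label) test-set representation are assumptions made for illustration.

# Sketch (assumed names): two-rule Sonar classifier produced by RO.
def classify_sonar(a11):
    """Optimized rule set: 0.0 < a11 <= 0.197 -> R, 0.197 < a11 <= 1.0 -> M."""
    return "R" if a11 <= 0.197 else "M"

def predictive_accuracy(test_set):
    """test_set: list of (a11_value, true_label) pairs from the unseen data."""
    correct = sum(1 for a11, label in test_set if classify_sonar(a11) == label)
    return correct / len(test_set)

# With 11 of the 62 unseen instances misclassified, this evaluates to 51/62 ≈ 82.2%.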
5 CONCLUSIONS
This paper has presented a rule optimizing technique motivated by psychological studies of human concept formation. The capability to switch from higher-level reasoning to reasoning at the lower instance level has indeed proven useful for determining the relevance of attributes throughout the rule optimizing process. The method is applicable to the optimization of rules obtained from any data mining technique. The evaluation of the method on rules learned from real-world data by different classifier methods has shown its effectiveness in optimizing the rule set. As future work, the method needs to be extended so that categorical attributes can be handled as well. Furthermore, it would be interesting to explore the possibility of the rule optimizing method becoming a stand-alone machine learning method in itself.
REFERENCES
Blake, C., Keogh, E. and Merz, C.J., 1998. UCI Repository of Machine Learning Databases, Irvine, CA: University of California, Department of Information and Computer Science. [http://www.ics.uci.edu/~mlearn/MLRepository.html].
Bruner, J.S., Goodnow, J.J., and Austin, G.A., 1956. A Study of Thinking, John Wiley & Sons, Inc., New York.
Hadzic, F. & Dillon, T.S., 2005. "CSOM: Self Organizing Map for Continuous Data", 3rd Int'l IEEE Conf. on Industrial Informatics, 10-12 August, Perth.
Hadzic, F. and Dillon, T.S., 2006. "Using the Symmetrical Tau (τ) Criterion for Feature Selection in Decision Tree and Neural Network Learning", 2nd Workshop on Feature Selection for Data Mining: Interfacing Machine Learning and Statistics, in conj. with SIAM Int'l Conf. on Data Mining, Bethesda.
Hadzic, F. & Dillon, T.S., 2007. "CSOM for Mixed Data Types", 4th Int'l Symposium on Neural Networks, June 3-7, Nanjing, China.
Kristal, L. (ed.), 1981. ABC of Psychology, Michael Joseph, London, pp. 56-57.
Pollio, H.R., 1974. The Psychology of Symbolic Activity, Addison-Wesley, Reading, Massachusetts.
Quinlan, J.R., 1990. "Probabilistic Decision Trees", in Machine Learning: An Artificial Intelligence Approach Volume 4, eds Kodratoff, Y. & Michalski, R., Morgan Kaufmann Publishers, Inc., San Mateo, California.
Rosch, E., 1977. "Classification of real-world objects: Origins and representations in cognition", in Thinking: Readings in Cognitive Science, eds P.N. Johnson-Laird & P.C. Wason, Cambridge University Press, Cambridge, pp. 212-222.
Sestito, S. and Dillon, T.S., 1994. Automated Knowledge Acquisition, Prentice Hall of Australia Pty Ltd, Sydney.
Zhou, X. and Dillon, T.S., 1991. "A statistical-heuristic feature selection criterion for decision tree induction", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, August, pp. 834-841.