Here we investigate the use of our CBR approach, based on the CCLAM, to classify enterprises in the sense described above. In the following we summarize the experiment and its conclusions; a detailed description cannot be included here because of the size and amount of the data and of the algorithmic steps.
The interested reader may consult http://decsai.ugr.es/ to obtain the whole data set and the complete discussion of the experiment.
The training set contains information about 1500 companies, each described by 53 attributes; that is, in our case n = 53 and T = 1500.
The variables have very different semantics and, accordingly, different domains and scales. In fact we have:
• 36 continuous attributes (e.g., debt level)
• 9 integer attributes (e.g., number of employees)
• 6 boolean attributes (e.g., whether the company has been audited)
• 2 categorical attributes (e.g., economic activity code)
To take into account the different distance (or similarity) functions needed to compare the different kinds of attributes, we must configure the P-Memory so that there are different Ψ_C and Υ_C functions depending on the kind of attribute that each processing element represents.
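As an illustration, the following Python sketch combines per-attribute local distances of the kinds listed above into a weighted generalized Minkowski distance. The local functions and their names are illustrative assumptions standing in for the Ψ_C and Υ_C configuration of the P-Memory, not their actual definitions.

def d_numeric(a, b):
    # Continuous and integer attributes, assumed pre-scaled to [0, 1].
    return abs(a - b)

def d_discrete(a, b):
    # Boolean and categorical attributes: overlap distance.
    return 0.0 if a == b else 1.0

def minkowski_distance(x, y, locals_, weights, r=2.0):
    # Weighted generalized Minkowski distance over heterogeneous attributes.
    s = sum(w * d(a, b) ** r for a, b, d, w in zip(x, y, locals_, weights))
    return s ** (1.0 / r)

# Example: a continuous, an integer, a boolean and a categorical attribute.
locals_ = [d_numeric, d_numeric, d_discrete, d_discrete]
weights = [0.4, 0.2, 0.1, 0.3]
print(minkowski_distance([0.7, 0.3, 1, "A"], [0.5, 0.3, 0, "A"],
                         locals_, weights))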
We do not know the degree of influence of each attribute on the health of a company. To estimate the relative importance of the attributes, we measure the goodness of a set of weights as the number of correct classifications of cases in a cross-validation process.
We assume that similar companies have the same classification (this assumption is the key to case-based reasoning). We also assume that the number of known cases is high enough. If the weights are right, then when we extract a case from the memory and use it as a new problem, the similar problems that remain stored will classify it correctly. We repeat this operation with all the cases, and the percentage of correct classifications is the value that measures the goodness of the set of weights, as sketched below.
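As an illustration, the following sketch evaluates a weight vector by leave-one-out cross-validation with nearest-neighbour retrieval. It relies on the hypothetical minkowski_distance sketched above, and weight_goodness is an illustrative name, not part of the system.

def weight_goodness(cases, labels, locals_, weights):
    # Leave-one-out: classify each case by its nearest neighbour among
    # the remaining cases; return the fraction classified correctly.
    correct = 0
    for i, probe in enumerate(cases):
        best_j = min((j for j in range(len(cases)) if j != i),
                     key=lambda j: minkowski_distance(probe, cases[j],
                                                      locals_, weights))
        if labels[best_j] == labels[i]:
            correct += 1
    return correct / len(cases)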
To find an appropriate set of weights for the training set of cases, we used genetic algorithms to find good initial solutions, which were then refined using simulated annealing, as sketched below.
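A minimal sketch of the refinement step follows, assuming a standard simulated-annealing loop over the goodness score above; the perturbation, cooling schedule and parameter values are illustrative choices, not the settings used in the experiment.

import math
import random

def refine_weights(weights, cases, labels, locals_,
                   temp=1.0, cooling=0.99, steps=500):
    # Simulated-annealing refinement of a GA-produced weight vector,
    # scored by the leave-one-out goodness above.
    current = list(weights)
    score = weight_goodness(cases, labels, locals_, current)
    for _ in range(steps):
        # Perturb one randomly chosen weight, keeping it non-negative.
        cand = list(current)
        k = random.randrange(len(cand))
        cand[k] = max(0.0, cand[k] + random.gauss(0.0, 0.05))
        cand_score = weight_goodness(cases, labels, locals_, cand)
        # Always accept improvements; accept worse moves with
        # probability exp((cand_score - score) / temp).
        if cand_score >= score or \
                random.random() < math.exp((cand_score - score) / temp):
            current, score = cand, cand_score
        temp *= cooling
    return current, score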
Once our CCLAM-based CBR system had been configured with the appropriate weights, it was tested with a set of 764 new cases to measure the number of correct classifications. It obtained a 73% success rate, a good result that justifies the use of the system.
5 CONCLUSIONS
• The CCLAM is a good choice for case-based reasoning because it can store any number of arbitrary cases and, since it has no spurious states, it does not recover cases that were not previously stored.
• The use of two CCLAMs, storing the problems and the solutions in different memories, allows the correct computation of the similarity of the problems and the correct retrieval of the suggested solution.
• The L-Memory allows the use of a voting system to select the most voted of the solutions proposed by the stored cases.
• The similarity between problems is computed by means of the generalized Minkowski distance with the correct configuration of the P-Memory.
• When the solution is expressed with attributes that can be aggregated, we can obtain a linear combination of the suggested solutions (both selection schemes are sketched after this list).
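As an illustration of the last two points, the following sketch shows both selection schemes, assuming the retrieved cases arrive as lists of proposed solutions and of their similarities to the problem; the names and the weighting are illustrative assumptions.

from collections import Counter

def vote(solutions):
    # Majority vote among the solutions proposed by the retrieved cases.
    return Counter(solutions).most_common(1)[0][0]

def aggregate(solutions, similarities):
    # Similarity-weighted linear combination for aggregable solutions.
    total = sum(similarities)
    return sum(s * w for s, w in zip(solutions, similarities)) / total

print(vote(["healthy", "healthy", "failing"]))      # -> healthy
print(aggregate([0.8, 0.4, 0.6], [0.9, 0.5, 0.7]))  # weighted mean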