skin cells in vivo. This technique also increases the
accuracy of the experts’ diagnosis, but even in the
hands of experts and in combination with dermoscopy
information, accuracy never reaches 100%.
Thus, we are especially interested in characterizing
skin lesions on the frontier between malignant and
benignant lesions. In our experiments we used
descriptions of skin lesions that had already been
excised, i.e., lesions that dermatologists considered
could be malignant melanoma. However, some of them
turned out to be benignant after histopathological
analysis. This means that they provide a good set of
suspicious lesions from which to generate a domain
model able to discriminate between malignant and
benignant lesions with similar characteristics.
We propose to take descriptions of known skin
lesions and to use a lazy learning method to obtain a
domain theory. Skin lesions are described using two
sets of features, dermatoscopic and confocal, and our
goal is to find a subset of features characterizing ma-
lignant lesions.
There are several works that automatically diagnose
malignant melanoma. MELAFIND (medgadget.com/archives/2005/08/melafind
system.html) is a device designed to determine whether
skin moles and lesions are malignant. It uses a
database of around 6000 already biopsied lesions to
find similarities with a new potentially malignant skin
lesion. An interesting comparison of the performance of
several automatic instruments with that of human
experts is given in (Vestergaard and Menzies, 2008).
The main conclusion of this comparison is that no
automatic method clearly outperforms human experts.
All these automatic instruments have a different goal
from ours, since they aim to take the role of the
dermatologist, analyzing and interpreting an image of
a skin lesion in order to diagnose it. In our work, the
goal is not to diagnose from an image but from the
interpretation of an image given by a dermatologist.
In fact, we do not want to take the dermatologist’s
role but to support dermatologists in diagnosing a
skin lesion.
In the present paper we introduce a classification
system that, using lazy learning methods, is able to
distinguish MM from similar benignant skin lesions.
The main goal is to minimize the number of MM
diagnosed as benignant and to maximize the number
of MM correctly diagnosed, although we have to
accept a reasonable number of false positives. In other
words, we primarily want to achieve high sensitivity
and, secondarily, a specificity as high as possible.
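To make the two measures concrete, the following minimal sketch computes sensitivity and specificity from a list of true and predicted diagnoses. The function name, the label strings and the example labels are assumptions made only for illustration; they are not data or code from our experiments.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="MM"):
    """Return (sensitivity, specificity), treating `positive` as the malignant class."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1  # MM correctly diagnosed
            else:
                fn += 1  # MM diagnosed as benignant (the error to minimize)
        else:
            if predicted == positive:
                fp += 1  # benignant lesion flagged as MM (accepted false positive)
            else:
                tn += 1  # benignant lesion correctly diagnosed
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up labels, for illustration only.
truth = ["MM", "MM", "MM", "benignant", "benignant", "benignant", "benignant"]
pred = ["MM", "MM", "benignant", "MM", "benignant", "benignant", "benignant"]
sens, spec = sensitivity_specificity(truth, pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy example one of the three MM is missed and one of the four benignant lesions is flagged as MM, so sensitivity is 2/3 and specificity is 3/4.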
In domains such as the current one, it can be
especially useful not only to classify a new problem
but also to generate some kind of explanation of the
domain model. Domain models are usually built
automatically using inductive learning methods
(Mitchell, 1997) that generalize the input data to
generate a model (or domain theory) that can be used
in the future to classify unseen data. Inductive
learning methods can overgeneralize when solution
classes are not clearly separated. This means that,
although the model fits the known data, it fails in the
classification of unseen objects. An example is the
domain of predictive toxicology, where the goal is to
find a model of carcinogenesis from the descriptions of
carcinogenic and non-carcinogenic chemical compounds
(Helma and Kramer, 2003). The difficulty in that
domain is that there are chemical compounds with
very similar chemical structures but different
carcinogenic activity. A similar situation occurs in the
characterization of skin lesions, since early malignant
melanoma can share many characteristics with
benignant lesions and, therefore, a dermatologist can
easily confuse them.
A different approach for classifying unseen examples
is to use some lazy learning method (instance-based
learning, case-based reasoning, etc.). Here, a new
problem is classified as belonging to a class by
assessing its similarity with a set of known examples.
Lazy learning methods are good classifiers but they do
not produce explicit generalizations and, therefore, no
domain knowledge can be built from them. Currently
there is a growing research line that focuses on
explaining the results of lazy learning methods (see,
for instance, (Roth-Berghofer, 2004; Plaza et al., 2005)
and the proceedings of the workshops on
Explanation-aware Computing held since 2004). In
(Armengol, 2008) we pointed out that if we could
generate some explicit generalization of the
classification process from a lazy learning method, we
could generate a domain theory. These generalizations
can be seen as local approximations and, by storing
them, we would obtain a model of the domain. Notice
that this domain theory is not complete, since it only
describes some areas of the problem space (those
around the problems already solved). Consequently,
explanations of a lazy learning method can be used
for knowledge discovery. In some sense, this is the
same idea as in explanation-based learning methods
(Mitchell et al., 1986), which generate domain rules
from one example.
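The idea of storing explanations as local generalizations can be sketched as follows. This is only an illustrative toy, not the LID method and not the implementation used in our experiments: the overlap similarity, the function names, the feature names and the four cases are all assumptions. A new case is classified by a k-NN vote, and the explanation is taken to be the attribute-value pairs that the new case shares with every neighbour supporting the predicted class.

```python
from collections import Counter


def overlap(case_a, case_b):
    """Count the attribute-value pairs that two symbolic descriptions share."""
    return sum(1 for attr, value in case_a.items() if case_b.get(attr) == value)


def classify_with_explanation(query, case_base, k=3):
    """Lazy k-NN vote plus the features shared by the query and the winning neighbours."""
    # No model is built in advance: similarity is assessed only when a query arrives.
    ranked = sorted(case_base, key=lambda case: overlap(query, case[0]), reverse=True)
    neighbours = ranked[:k]
    votes = Counter(label for _, label in neighbours)
    predicted = votes.most_common(1)[0][0]
    # Explanation: what the query has in common with all neighbours of the predicted class.
    supporters = [desc for desc, label in neighbours if label == predicted]
    explanation = {
        attr: value
        for attr, value in query.items()
        if all(desc.get(attr) == value for desc in supporters)
    }
    return predicted, explanation


# Hypothetical case base; feature names and values are invented for illustration.
case_base = [
    ({"pigment_network": "atypical", "asymmetry": "yes", "pagetoid_cells": "present"}, "MM"),
    ({"pigment_network": "atypical", "asymmetry": "no", "pagetoid_cells": "present"}, "MM"),
    ({"pigment_network": "typical", "asymmetry": "yes", "pagetoid_cells": "absent"}, "benignant"),
    ({"pigment_network": "typical", "asymmetry": "no", "pagetoid_cells": "absent"}, "benignant"),
]
query = {"pigment_network": "atypical", "asymmetry": "yes", "pagetoid_cells": "present"}

label, explanation = classify_with_explanation(query, case_base, k=3)

# Storing each explanation with its class yields a partial, local domain theory.
domain_theory = [(explanation, label)]
print(label, explanation)
```

Each stored pair describes only the region of the problem space around one solved problem, which is why a theory collected in this way is necessarily incomplete.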
We experimented with two lazy learning methods:
the well-known k-NN method and the LID method
(Armengol and Plaza, 2001). We compare their
predictive accuracy with that of a decision tree, and
we show that the lazy methods perform better than the
decision tree, also in terms of sensitivity and
specificity. From the experiments we also
constructed a domain theory that has been very use-