order to disfavor/favor C_j, given that Q_j is medium, then Q_j needs to be moved towards δ_min/δ_max (increased/decreased by a value proportional to Fav). The same concept is applied to rules 4 and 5. Rules 6 and 7 are introduced to control the amount of increment/decrement when Q_j is already high/low.
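As an illustration only, the following Python sketch mimics how such rules could nudge Q_j towards its bounds. The function name, the rate parameter and the exact update formula are assumptions made for this sketch and do not reproduce the paper's FIS.

```python
import numpy as np

def adjust_q(q_j, fav, delta_min, delta_max, rate=0.5):
    """Hypothetical sketch of the rule behaviour: pull Q_j towards
    delta_min (to disfavor C_j) or delta_max (to favor C_j) by a step
    proportional to the degree of favoring |Fav - 0.5| (rules 2-5).
    The step also shrinks as Q_j approaches its bound, limiting the
    increment/decrement when Q_j is already high/low (rules 6-7)."""
    target = delta_max if fav >= 0.5 else delta_min   # favor vs. disfavor C_j
    strength = abs(fav - 0.5) * 2.0                   # proportional to Fav
    step = rate * strength * (target - q_j)           # shrinks near the bound
    return float(np.clip(q_j + step, delta_min, delta_max))

print(adjust_q(q_j=1.0, fav=0.8, delta_min=0.5, delta_max=2.0))
```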
5 EXPERIMENTAL RESULTS
We tested the algorithm on a number of real-world pattern classification problems. In the first experiment, the following kNN variants are considered: the traditional k-nearest neighbor classifier (kNN), the weighted k-nearest neighbor classifier (wkNN), the modified kNN according to the proposed distance measure described in section 3 (mkNN), mkNN with weighted neighbors (wmkNN), the Fuzzy kNN (FkNN), adopted from (Keller et al.), the Evidential kNN based on the Dempster-Shafer theory of evidence (Denoeux, 1995) (DSkNN), the neighborhood component analysis (NCA) (Goldberger et al.), and Tan's class weighting kNN (CWkNN) described above.
Table 1: Dataset Description.
Name  # Patterns  # Attributes  C1/C2 ratio
Pima 768 8 0.54
Hill 606 100 0.95
Cmc 844 9 0.65
Sonar 208 60 0.87
Mamm 814 5 0.93
Hearts 270 13 0.80
Btrans 748 4 0.31
Heart 267 22 0.26
Bands 351 30 0.60
Gcredit 1000 24 0.43
Teach 151 5 0.48
Wdbc 569 30 0.59
Acredit 690 14 0.80
Haber 306 3 0.36
Ion 351 34 0.56
In our experiments, we have used 16 datasets from the UCI repository website (Newman, 2007), as shown in Table 1. 80% of the patterns were used for training and 20% for testing. For each method, several values of k were used, k = {3, 5, ..., 15}, and the one that gave the best performance under a cross-validation scheme was chosen. To evaluate the performance of each method, class-wise classification accuracy was used (Ac_j is the accuracy of class C_j). We also calculated the average classification accuracy of the two classes, Acv = (Ac_1 + Ac_2)/2. The Acv results of the eight kNN variants are presented in Table 2. The table shows that when kNN and wkNN produce different performance on the two classes, considerable improvement can be achieved using the proposed method (mkNN and wmkNN). The mean of Acv over all tested datasets shows that both mkNN and wmkNN can noticeably improve the accuracy of the underrepresented class as well as Acv.
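As a concrete illustration of this evaluation protocol (not of the proposed mkNN itself), the Python sketch below uses a synthetic imbalanced dataset and scikit-learn's standard kNN as stand-ins; the 80/20 split, the search over k in {3, 5, ..., 15} by cross-validation and the Acv measure follow the description above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def class_wise_accuracy(y_true, y_pred, cls):
    """Ac_j: accuracy computed only over the patterns of class `cls`."""
    mask = (y_true == cls)
    return float(np.mean(y_pred[mask] == cls))

# Synthetic stand-in for one of the UCI datasets (two imbalanced classes).
X, y = make_classification(n_samples=768, n_features=8,
                           weights=[0.65, 0.35], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

# Pick k from {3, 5, ..., 15} by cross-validation on the training part.
best_k = max(range(3, 16, 2),
             key=lambda k: cross_val_score(KNeighborsClassifier(k),
                                           X_tr, y_tr, cv=5).mean())

y_pred = KNeighborsClassifier(best_k).fit(X_tr, y_tr).predict(X_te)
ac1 = class_wise_accuracy(y_te, y_pred, 0)
ac2 = class_wise_accuracy(y_te, y_pred, 1)
acv = (ac1 + ac2) / 2.0                      # Acv = (Ac_1 + Ac_2) / 2
print(f"k={best_k}  Ac_1={ac1:.3f}  Ac_2={ac2:.3f}  Acv={acv:.3f}")
```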
It is worth mentioning that CWkNN fails when applied to certain datasets, as it does not take into account the distances between neighbors of different classes, i.e., the weights depend only on the number of patterns that belong to each class. Additionally, this method needs tuning of the exponent α. On the other hand, CWkNN performed slightly better than the proposed method when applied to datasets that have a relatively small number of patterns, such as Heart and Hearts, where distances between neighbors of different classes may not give a good estimate of the weights.
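For clarity, the toy sketch below shows what such count-only class weights might look like; the particular formula (n_max/n_j)^α is an assumption chosen for illustration, not Tan's actual weighting.

```python
import numpy as np

def class_count_weights(y_train, alpha=1.0):
    """Hypothetical per-class weights in the spirit of CWkNN: each class
    receives a weight that depends only on its number of training patterns
    (not on neighbor distances), controlled by the exponent alpha."""
    classes, counts = np.unique(y_train, return_counts=True)
    weights = (counts.max() / counts) ** alpha   # rarer class -> larger weight
    return dict(zip(classes.tolist(), weights.tolist()))

# Example: 700 patterns of class 1 vs. 300 of class 2, alpha = 1.
y = np.array([1] * 700 + [2] * 300)
print(class_count_weights(y))   # {1: 1.0, 2: ~2.33}
```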
In the second experiment, the issue of favoring a particular class is considered by applying the FIS explained in section 4, referred to here as FISkNN, to selected datasets from Table 1. The value of Fav was varied between 0.9 and 0.1, and the obtained results are shown in Table 3. We can see that, for all of the examined datasets, FISkNN managed to adjust the value of Q_j such that quite a high classification accuracy of the desired class is achieved. This, of course, comes at the expense of reducing the accuracy of the other class: the higher the accuracy of one class, the lower the accuracy of the other. As explained earlier, this represents an additional option given to the user in case he/she wants to give more emphasis to a particular class. It is worth mentioning that the highest value of the mean of Acv is achieved around Fav = 0.5, which is basically the wmkNN described in section 3. This is also the value that produces the minimum difference between the mean of Ac_1 and that of Ac_2, i.e., the best compromise between sensitivity and specificity.
Fig. 4 shows the ROC curves of the different classifiers for the Mamm, Bands and Pima datasets. As the traditional kNN and its variants do not give the option of favoring a particular class, their curves are drawn using three points only: {0,0}, {1,1} and the average class-wise accuracy of each classifier. The proposed method, on the other hand, can construct the full curve, which clearly shows the behavior of the classifier. These curves are quite useful when the user wants to know the trade-off involved in favoring a particular class (see the sketch below).
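A minimal sketch of how such a curve can be traced by sweeping Fav is given here; the evaluate callable (assumed to return the class-wise accuracies Ac_1 and Ac_2 for a given Fav) and the toy stand-in are hypothetical and only illustrate the construction.

```python
import numpy as np

def roc_from_fav_sweep(evaluate, favs=np.arange(0.1, 0.91, 0.1)):
    """Build an ROC-style curve for a classifier exposing a Fav knob.
    `evaluate(fav)` is assumed to train/test the classifier and return
    (Ac_1, Ac_2); each Fav yields one operating point (1 - Ac_2, Ac_1),
    and {0,0}, {1,1} anchor the ends of the curve, as in Fig. 4."""
    points = [(0.0, 0.0), (1.0, 1.0)]
    for fav in favs:
        ac1, ac2 = evaluate(fav)
        points.append((1.0 - ac2, ac1))   # (1 - specificity, sensitivity)
    return np.array(sorted(points))

# Toy stand-in: higher Fav trades specificity (Ac_2) for sensitivity (Ac_1).
toy = lambda fav: (0.40 + 0.55 * fav, 0.95 - 0.55 * fav)
print(roc_from_fav_sweep(toy))
```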
graphs also show that the proposed algorithm