Table 2: Experimental results on the UCI machine learning repository. Hyperparameters were tuned by cross-validation using only the training data. "Ins." denotes the number of instances after removing missing data, "Att." denotes the number of attributes excluding the class attribute, and "#" denotes the number of support vectors or prototypes. The proposed method,
referred to as ILM, was evaluated with different parameter sets: some parameters were fixed as (a) C = 0, p = −2⁸, (b) C = 0, (c) p = −1, and the other parameters were varied as in Table 1.
Dataset         Ins.  Att.   SVM             ILM (a)         ILM (b)         ILM (c)
                             Error (%)    #  Error (%)    #  Error (%)    #  Error (%)    #
BreastCancer     683     9      2.8       66     2.5      32     2.6      64     2.6      64
Cards            653    15     13.6      189    13.2       8    13.9       8    14.6       4
Heart1           297    13     14.5      159    16.5       2    18.2      64    14.8       2
HouseVotes84     435    16      3.7      101     3.7       2     3.9      16     3.7      32
Ionosphere       351    34      4.8      148     8.3       8     3.7      32     4.3      64
Liver            345     6     28.1      203    28.1       4    27.5       4    26.3       4
P.I. Diabetes    768     8     22.3      418    22.5       2    22.3      32    21.9      64
Sonar            208    60     12.1      125    11.1      16     9.6       8    11.5      16
Tictactoe        958     9      1.7      349     1.7       2     0.9      16     1.7       2
conventional nearest neighbor rule. It was shown that prototypes and biases can be trained simultaneously by General Loss Minimization, which is a general framework for classifier design. The preliminary experiments raised the possibility that prototypes with larger biases can be removed without degrading performance, although this redundancy removal should be investigated further. Experimental results on the UCI machine learning repository revealed that the proposed method achieves classification accuracy comparable to or higher than that of SVM on all nine datasets, with far fewer prototypes than support vectors. In future work, the proposed method will be evaluated on various real-world classification problems.
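To make the pruning idea above concrete, the following is a minimal sketch (not the paper's implementation) of a nearest-prototype classifier with per-prototype biases and bias-based redundancy removal. The discriminant used here (squared Euclidean distance plus bias) and the pruning threshold are illustrative assumptions; the exact formulation of ILM may differ.

# Hypothetical sketch: prototype classification with per-prototype biases
# and removal of large-bias prototypes. Assumed discriminant, not ILM's exact form.
import numpy as np

def classify(x, prototypes, labels, biases):
    # Assign x to the class of the prototype minimizing
    # squared Euclidean distance plus its bias (assumption).
    scores = np.sum((prototypes - x) ** 2, axis=1) + biases
    return labels[np.argmin(scores)]

def prune_large_bias(prototypes, labels, biases, threshold):
    # Remove prototypes whose bias exceeds a threshold, reflecting the
    # observation that large-bias prototypes tend to be redundant.
    keep = biases <= threshold
    return prototypes[keep], labels[keep], biases[keep]

# Example usage with random data (illustration only).
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(8, 4))
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
biases = rng.uniform(0.0, 2.0, size=8)
prototypes, labels, biases = prune_large_bias(prototypes, labels, biases, 1.0)
print(classify(rng.normal(size=4), prototypes, labels, biases))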
REFERENCES
Blake, C. and Merz, C. (1998). UCI repository of machine learning databases. University of California, Irvine, Dept. of Information and Computer Sciences.
Cortes, C. and Vapnik, V. (1995). Support-vector networks. Machine Learning, 20:273–297.
Crammer, K., Gilad-Bachrach, R., Navot, A., and Tishby, N. (2003). Margin analysis of the LVQ algorithm. In Advances in Neural Information Processing Systems, volume 15, pages 462–469. MIT Press.
Giraud, B. G., Lapedes, A. S., Liu, L. C., and Lemm, J. C. (1995). Lorentzian neural nets. Neural Networks, 8(5):757–767.
Grbovic, M. and Vucetic, S. (2009). Learning vector quantization with adaptive prototype addition and removal. In International Conference on Neural Networks, pages 994–1001.
Hammer, B. and Villmann, T. (2002). Generalized relevance learning vector quantization. Neural Networks, 15(8–9):1059–1068.
Karayiannis, N. (1996). Weighted fuzzy learning vector quantization and weighted generalized fuzzy c-means algorithm. In IEEE International Conference on Fuzzy Systems, pages 773–779.
Kohonen, T. (1995). Self-Organizing Maps. Springer-Verlag.
Meyer, D., Leisch, F., and Hornik, K. (2003). The support vector machine under test. Neurocomputing, 55(1–2):169–186.
Qin, A. K. and Suganthan, P. N. (2004). A novel kernel prototype-based learning algorithm. In International Conference on Pattern Recognition (ICPR), pages 621–624.
Sato, A. (1998). A formulation of learning vector quantization using a new misclassification measure. In the 14th International Conference on Pattern Recognition, volume 1, pages 322–325.
Sato, A. (2010). A new learning formulation for kernel classifier design. In International Conference on Pattern Recognition (ICPR), pages 2897–2900.
Sato, A. and Yamada, K. (1996). Generalized learning vector quantization. In Advances in Neural Information Processing Systems, volume 8, pages 423–429. MIT Press.
Schneider, P., Biehl, M., and Hammer, B. (2009). Adaptive relevance matrices in learning vector quantization. Neural Computation, 21(12):3532–3561.
Villmann, T. and Haase, S. (2011). Divergence-based vector quantization. Neural Computation, 23(5):1343–1392.