PROBABILISTIC ESTIMATION OF VAPNIK-CHERVONENKIS DIMENSION
Przemysław Klęsk
Department of Methods of Artificial Intelligence and Applied Mathematics, West Pomeranian University of Technology, ul. Żołnierska 49, Szczecin, Poland
Keywords:
Statistical learning theory, Machine learning, Vapnik-Chervonenkis dimension, Binary classification.
Abstract:
We present an idea for the probabilistic estimation of the Vapnik-Chervonenkis dimension of a given set of indicator functions. The idea is embedded in two algorithms we propose, named A and A′. Both algorithms are based on an approach that can be described as expand or divide and conquer. The algorithms are also parametrized by probabilistic constraints expressed in the form of an (ε,δ)-precision, which determines how often and by how much the estimate may deviate from the true VC-dimension. An analysis of convergence and computational complexity for the proposed algorithms is also presented.
1 INTRODUCTION
The Vapnik-Chervonenkis dimension is an important notion within Statistical Learning Theory (Vapnik and Chervonenkis, 1968; Vapnik and Chervonenkis, 1989; Vapnik, 1995; Vapnik, 1998). Many bounds on generalization or sample complexity are based on it.
Recently, several other measures of the capacity (richness) of function sets have been under study. Of particular interest are covering numbers (Bartlett et al., 1997; Anthony and Bartlett, 2009). In many cases covering numbers can lead to tighter bounds (on generalization or sample complexity) than the pessimistic bounds based on the VC-dimension. However, the constructive derivation of covering numbers is itself usually a challenge. One has to suitably exploit some properties of the given set of functions, or of the learning algorithm, and discover how they translate into a cover. One such attractive result, e.g., is from (Zhang, 2002) and relates to regularization. Qualitatively, it states that for sets of functions linear in their parameters and under an L_q-regularization (for general q = 1, 2, ...), the bound on the covering number scales only linearly with the dimension of the input domain. This allows one to learn and generalize well with a sample complexity logarithmic in the number of attributes. On the other hand, there exist results where the property used for the derivation of covering numbers is actually the known VC-dimension of some set of functions (Anthony and Bartlett, 2009), which again proves its usefulness.
For some sets of functions the exact value of the VC-dimension has been established by suitable combinatorial or geometric proofs (often very complex). Here are some examples. For polynomials over R^d of degree at most n, the VC-dim is the binomial coefficient (n+d choose d); see e.g. (Anthony and Bartlett, 2009). For hyperplanes in R^d (which can be bases for multilayer perceptrons), the VC-dim is d + 1 (Vapnik, 1998). For rectangles in R^d, the VC-dim is 2d (Cherkassky and Mulier, 1998). For spheres in R^d (which can be bases of RBF neural networks), the VC-dim is d + 1 (Cherkassky and Mulier, 1998). As regards linear combinations of such bases, the VC-dim can typically be bounded by the number of bases times the VC-dim of a single base (Anthony and Bartlett, 2009, p. 154); however, this fact usually requires a careful analysis.
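The claim that rectangles in R^d have VC-dim 2d can be checked concretely for d = 2 by brute force: a labeling of points is achievable by an axis-aligned rectangle exactly when the bounding box of the positive points contains no negative point. The sketch below (an illustrative script, not part of the paper's algorithms) verifies that four points arranged in a diamond are shattered, witnessing VC-dim ≥ 4 = 2d.

```python
from itertools import product

def rect_achieves(points, labels):
    """A labeling is achievable by an axis-aligned rectangle iff the
    bounding box of the positive points contains no negative point."""
    pos = [p for p, y in zip(points, labels) if y == 1]
    if not pos:  # an empty rectangle realizes the all-negative labeling
        return True
    lo = [min(c) for c in zip(*pos)]
    hi = [max(c) for c in zip(*pos)]
    neg = [p for p, y in zip(points, labels) if y == 0]
    return not any(all(lo[i] <= p[i] <= hi[i] for i in range(len(p)))
                   for p in neg)

def shattered(points):
    """Check whether every one of the 2^n labelings of the points
    is achievable, i.e. whether the set is shattered."""
    n = len(points)
    return all(rect_achieves(points, labels)
               for labels in product([0, 1], repeat=n))

# Four points forming a "diamond" are shattered by rectangles in R^2.
diamond = [(0, 1), (0, -1), (1, 0), (-1, 0)]
print(shattered(diamond))  # True
```

Adding a fifth point at the center breaks shattering (labeling the four outer points positive and the center negative is not achievable), consistent with the exact value 2d = 4.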
Some analysis has also been done on the computational complexity of determining the VC-dimension. In particular, in (Papadimitriou and Yannakakis, 1996) the authors take up the following problem: "given a set of functions F and a natural number k, is VC-dim(F) ≥ k?", i.e. one asks about a lower bound on the VC-dimension. The problem is proved to be LogNP-complete.
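The decision problem above can be made tangible for a toy class of functions. The hypothetical sketch below answers "is VC-dim(F) ≥ k?" by exhaustively searching a finite pool of points for a k-point subset that is shattered; the exponential 2^k inner loop hints at why the general problem is hard. Intervals on the real line (known VC-dim 2) serve as F.

```python
from itertools import combinations, product

def interval_achieves(points, labels):
    """A labeling of real points is achievable by a closed interval
    iff no negative point lies between the extreme positive points."""
    pos = sorted(p for p, y in zip(points, labels) if y == 1)
    if not pos:  # an empty interval realizes the all-negative labeling
        return True
    lo, hi = pos[0], pos[-1]
    return not any(lo <= p <= hi
                   for p, y in zip(points, labels) if y == 0)

def vcdim_at_least(achieves, pool, k):
    """Brute-force answer to: is VC-dim >= k?  Searches the pool for a
    k-point subset on which all 2^k labelings are achievable."""
    return any(all(achieves(list(subset), labels)
                   for labels in product([0, 1], repeat=k))
               for subset in combinations(pool, k))

pool = [0.0, 1.0, 2.0, 3.0, 4.0]
print(vcdim_at_least(interval_achieves, pool, 2))  # True
print(vcdim_at_least(interval_achieves, pool, 3))  # False: VC-dim of intervals is 2
```

For k = 3 the labeling (+, -, +) of any three ordered points is never achievable by an interval, so the search fails, matching the known value.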
Our motivation for this paper is to introduce an idea for algorithms which, given an arbitrary set of functions (plus a learning algorithm), would be able to estimate its VC-dimension with an imposed probabilistic accuracy. Such algorithms, if sufficiently suc-
Klęsk P. PROBABILISTIC ESTIMATION OF VAPNIK-CHERVONENKIS DIMENSION. DOI: 10.5220/0003721702620270. In Proceedings of the 4th International Conference on Agents and Artificial Intelligence (ICAART-2012), pages 262-270. ISBN: 978-989-8425-95-9. Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)