LEARNING METHOD UTILIZING SINGULAR REGION
OF MULTILAYER PERCEPTRON
Ryohei Nakano, Seiya Satoh and Takayuki Ohwaki
Department of Computer Science, Chubu University, 1200 Matsumoto-cho, Kasugai 487-8501, Japan
Keywords:
Multilayer perceptron, Singular region, Learning method, Polynomial network, XOR problem.
Abstract:
In the search space of a multilayer perceptron having J hidden units, MLP(J), there exists a singular flat region
created by the projection of the optimal solution of MLP(J−1). Since such a singular region causes serious
slowdown of learning methods, a method for avoiding the region has been sought. However, avoiding the region
does not guarantee the quality of the final solution. This paper proposes a new learning method which does
not avoid but makes good use of singular regions to find a solution good enough for MLP(J). The potential of
the method is shown by our experiments using artificial data sets, the XOR problem, and a real data set.
1 INTRODUCTION
It is known in MLP learning that the MLP(J) parameter
subspace having the same input-output map as
an optimal solution of MLP(J−1) forms a singular
region, and that such a singular flat region causes stagnation
of learning (Fukumizu and Amari, 2000). Natural
gradient (Amari, 1998; Amari et al., 2000) was once
proposed to avoid such stagnation of MLP learning,
but even that method may get stuck in singular regions
and is not guaranteed to find a good enough solution.
Recently an alternative constructive method has been
proposed (Minnett, 2011).
It is also known that many useful statistical models,
such as MLPs, Gaussian mixtures, and HMMs, are
singular models having singular regions where parameters
are nonidentifiable. While theoretical research
has been pursued vigorously to clarify the mathematical
structure and characteristics of singular models
(Watanabe, 2009), experimental and algorithmic research
is still insufficient to fully support the theories.
In the MLP parameter space there are many local
minima forming equivalence classes (Sussmann, 1992).
Even if we exclude equivalence classes, it is widely believed
that local minima still remain (Duda et al.,
2001). When we adopt an exponential function as the
activation function of an MLP (Nakano and Saito, 2002),
local minima surely exist due to the expressive
power of polynomials. For the XOR problem, however, it
was proved that there are no local minima (Hamey, 1998).
Thus, since we have no clear knowledge of the MLP
parameter space, we usually run a learning method
repeatedly with different initial weights to find a good
enough solution.
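The conventional practice just mentioned, repeating training from different random initial weights and keeping the best result, can be sketched as follows. This is an illustrative sketch only, not code from the paper; the network (one sigmoid hidden layer, linear output, batch gradient descent on mean squared error) and all function names such as `multistart` are our own choices.

```python
import numpy as np

def mlp_forward(X, W1, b1, W2, b2):
    # One-hidden-layer perceptron: sigmoid hidden units, linear output.
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return H @ W2 + b2, H

def train_once(X, y, J, rng, epochs=2000, lr=0.5):
    # Plain batch gradient descent from one random initialization.
    n, d = X.shape
    W1 = rng.normal(size=(d, J)); b1 = np.zeros(J)
    W2 = rng.normal(size=(J, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        out, H = mlp_forward(X, W1, b1, W2, b2)
        err = out - y
        dH = (err @ W2.T) * H * (1 - H)           # backprop through sigmoid
        W2 -= lr * (H.T @ err) / n; b2 -= lr * err.mean(0)
        W1 -= lr * (X.T @ dH) / n;  b1 -= lr * dH.mean(0)
    out, _ = mlp_forward(X, W1, b1, W2, b2)
    return float(np.mean((out - y) ** 2))

def multistart(X, y, J, restarts=5, seed=0):
    # Repeat training with different initial weights; keep the best MSE.
    rng = np.random.default_rng(seed)
    return min(train_once(X, y, J, rng) for _ in range(restarts))

# XOR training data
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
best = multistart(X, y, J=2, restarts=5)
```

The weakness of this practice, which motivates the proposed method, is that no single restart is guaranteed to escape singular regions, so the quality of `best` depends on luck in initialization.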
This paper proposes a new learning method which
does not avoid but makes good use of singular regions
to find a good enough solution. The method starts
with an MLP having one hidden unit and then gradually
increases the number of hidden units until the
intended number is reached. When it increases the number of
hidden units from J−1 to J, it utilizes an optimum
of MLP(J−1) to form the singular region in the MLP(J)
parameter space. The singular region forms a line,
and since points along the line are saddles, the learning
method can descend from it in the MLP(J)
parameter space. Thus, we can always find a solution of MLP(J)
better than the local minimum of MLP(J−1). Our
method is evaluated by experiments for sigmoidal
and polynomial-type MLPs using artificial data sets,
the XOR problem, and a real data set.
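One way the growth step from MLP(J−1) to MLP(J) can be realized is to duplicate a hidden unit of the trained MLP(J−1), keeping its input weights and splitting its output weight into fractions α and 1−α, which leaves the input-output map unchanged and places the network in the singular region; a small perturbation then lets gradient descent leave the saddle. The sketch below illustrates this idea; it is our own simplified reading, not the paper's exact algorithm, and names such as `split_unit` and the choice of perturbation are ours.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    # Sigmoid hidden layer, linear output.
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return H @ W2 + b2, H

def descend(X, y, params, epochs=2000, lr=0.3):
    # Batch gradient descent on mean squared error.
    W1, b1, W2, b2 = [p.copy() for p in params]
    n = len(X)
    for _ in range(epochs):
        out, H = forward(X, W1, b1, W2, b2)
        err = out - y
        dH = (err @ W2.T) * H * (1 - H)
        W2 -= lr * (H.T @ err) / n; b2 -= lr * err.mean(0)
        W1 -= lr * (X.T @ dH) / n;  b1 -= lr * dH.mean(0)
    return W1, b1, W2, b2

def split_unit(params, j, alpha=0.5, eps=1e-3, rng=None):
    # Grow MLP(J-1) into MLP(J): duplicate hidden unit j with identical
    # input weights and split its output weight as (alpha, 1-alpha).
    # With eps=0 the input-output map is unchanged, i.e. the new point
    # lies in the singular region; eps>0 nudges it off the saddle.
    W1, b1, W2, b2 = params
    rng = rng or np.random.default_rng(0)
    W1 = np.hstack([W1, W1[:, [j]]])
    b1 = np.append(b1, b1[j])
    w = W2[j, 0]
    W2 = np.vstack([W2, [[(1.0 - alpha) * w]]])
    W2[j, 0] = alpha * w
    W1[:, -1] += eps * rng.normal(size=W1.shape[0])
    return W1, b1, W2, b2

# Incremental training on XOR: MLP(1) -> MLP(2)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
rng = np.random.default_rng(1)
p1 = (rng.normal(size=(2, 1)), np.zeros(1), rng.normal(size=(1, 1)), np.zeros(1))
p1 = descend(X, y, p1)                      # fit MLP(1)
p2 = descend(X, y, split_unit(p1, j=0))     # grow to MLP(2) and continue
```

Because the split point reproduces the MLP(J−1) optimum's map exactly, descent from it can only keep or lower the training error, which is the sense in which MLP(J) is guaranteed to do no worse than MLP(J−1).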
Section 2 describes how singular regions of an MLP
can be constructed. Section 3 explains the proposed
method, and Section 4 shows how the method worked
in our experiments.
2 SINGULAR REGION OF
MULTILAYER PERCEPTRON
This section explains how an optimum of MLP(J−1)
can be used to form the singular region in the MLP(J)
parameter space (Fukumizu and Amari, 2000). This
DOI: 10.5220/0003652501060111
In Proceedings of the International Conference on Neural Computation Theory and Applications (NCTA-2011), pages 106-111
ISBN: 978-989-8425-84-3
Copyright © 2011 SCITEPRESS (Science and Technology Publications, Lda.)