and 10.7 times larger than the smallest, respectively.
For the Parkinsons telemonitoring dataset, RBF-SSF(pH) and RBF-SSF achieved the smallest test MSEs, followed in ascending order by BFGS, newrb, and SD, whose test MSEs were 1.17, 1.34, and 1.98 times larger than the smallest, respectively.
RBF-SSF(pH) and RBF-SSF obtained much the same smallest test MSEs. Although a model with a very small training error often generalizes poorly due to overfitting, both RBF-SSF variants found solutions that minimized the training and test errors at the same time. We attribute this to two factors: RBF-SSFs have a strong capability to find excellent solutions, and RBF networks are relatively resistant to overfitting. We do not believe the result is explained by low-noise data, since the Parkinsons telemonitoring dataset has a minimum training MSE of 0.25 and a minimum test MSE of 0.38 for normalized data; errors of that size on normalized targets indicate a substantial noise level.
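For reference, the following minimal sketch shows how such MSEs on normalized data can be computed, assuming z-score normalization of the target; the normalization scheme and all names below are our assumptions, not details taken from the paper.

    import numpy as np

    def mse_on_normalized(y_true, y_pred, mu, sigma):
        # MSE after z-score normalization of the target. Since the
        # normalized target has roughly unit variance, an MSE of 0.25
        # leaves about 25% of the target variance unexplained, i.e.
        # the data are not low-noise.
        return np.mean(((y_true - mu) / sigma - (y_pred - mu) / sigma) ** 2)

    # toy usage with synthetic values (illustration only)
    rng = np.random.default_rng(1)
    y = rng.normal(loc=20.0, scale=8.0, size=200)    # raw targets
    y_hat = y + rng.normal(scale=4.0, size=200)      # noisy predictions
    print(mse_on_normalized(y, y_hat, y.mean(), y.std()))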
(3) Total Processing Time
Regarding total processing time for the Schwefel function dataset, newrb was the fastest since it only solves linear regressions. Among the other four methods, BFGS was the fastest, followed by SD, RBF-SSF(pH), and RBF-SSF, which required 1.02, 1.05, and 1.34 times longer than BFGS, respectively.
For the Parkinsons telemonitoring dataset, newrb was again the fastest; among the other four, SD was the fastest, followed by BFGS, RBF-SSF(pH), and RBF-SSF, which required 1.01, 1.19, and 4.26 times longer than SD, respectively.
The proposed RBF-SSF(pH) was 1.28 and 3.57 times faster than the original RBF-SSF on the two datasets, and only 1.05 and 1.18 times slower than BFGS. We can therefore say that RBF-SSF(pH) runs almost as fast as BFGS, a well-established learning method. Moreover, finding excellent solutions of RBF networks inevitably requires a certain amount of processing time.
5 CONCLUSIONS
Recently, a very powerful one-stage learning method called RBF-SSF was proposed to find excellent solutions of RBF networks; however, it requires a lot of time, mainly because it computes the full Hessian. This paper proposes a faster version of RBF-SSF, called RBF-SSF(pH), which introduces partial calculation of the Hessian; a rough illustrative sketch of this idea is given below. Experiments using two datasets showed that RBF-SSF(pH) ran as fast as common one-stage learning methods while keeping the excellent solution quality. In the future, we plan to apply RBF-SSF(pH) to more datasets to further demonstrate its advantages.
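To make the idea of partial Hessian calculation concrete, the sketch below (Python with NumPy) assembles a Gauss-Newton Hessian H = J^T J block-wise over hidden units and recomputes only the blocks that involve units whose parameters changed in the last search step, reusing the rest. This is our illustrative reading, not the authors' exact implementation; the function names, the Gauss-Newton approximation, and the per-unit parameter grouping are all assumptions.

    import numpy as np

    def rbf_jacobian_blocks(X, centers, widths, weights):
        # Per-hidden-unit Jacobian blocks of a Gaussian RBF network output
        # with respect to that unit's parameters (weight, center, width).
        blocks = []
        for c, s, w in zip(centers, widths, weights):
            diff = X - c                               # (n, d)
            r2 = np.sum(diff ** 2, axis=1)             # squared distances
            phi = np.exp(-r2 / (2.0 * s ** 2))         # basis outputs
            d_w = phi                                  # d out / d weight
            d_c = (w * phi / s ** 2)[:, None] * diff   # d out / d center
            d_s = w * phi * r2 / s ** 3                # d out / d width
            blocks.append(np.column_stack([d_w, d_c, d_s]))
        return blocks

    def partial_gauss_newton_hessian(blocks, H_old, changed):
        # Gauss-Newton Hessian H = J^T J assembled block-wise; only the
        # blocks involving a hidden unit whose parameters changed are
        # recomputed, all other blocks are copied over from H_old.
        sizes = [B.shape[1] for B in blocks]
        offs = np.concatenate(([0], np.cumsum(sizes)))
        H = H_old.copy()
        for i, Bi in enumerate(blocks):
            for j, Bj in enumerate(blocks):
                if i in changed or j in changed:
                    H[offs[i]:offs[i+1], offs[j]:offs[j+1]] = Bi.T @ Bj
        return H

    # toy usage: 3 hidden units, only unit 0 modified by the last step
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 2))
    centers = rng.normal(size=(3, 2))
    widths = np.full(3, 1.0)
    weights = rng.normal(size=3)
    blocks = rbf_jacobian_blocks(X, centers, widths, weights)
    p = sum(B.shape[1] for B in blocks)
    H_full = partial_gauss_newton_hessian(blocks, np.zeros((p, p)), {0, 1, 2})
    H_part = partial_gauss_newton_hessian(blocks, H_full, {0})

Under this reading, the cost of a Hessian update shrinks from all unit pairs to only those pairs touching the changed units, which is consistent with the reported speedup over the full-Hessian RBF-SSF.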
ACKNOWLEDGMENT
This work was supported by Grants-in-Aid for Scien-
tific Research (C) 16K00342.
REFERENCES
Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford University Press.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39:1–38.
Dheeru, D. and Taniskidou, E. K. (2017). UCI machine
learning repository.
Fletcher, R. (1987). Practical methods of optimization, 2nd edition. John Wiley & Sons.
Fukumizu, K. and Amari, S. (2000). Local minima and
plateaus in hierarchical structures of multilayer per-
ceptrons. Neural Networks, 13(3):317–327.
Lázaro, M., Santamaría, I., and Pantaleón, C. (2003). A new EM-based training algorithm for RBF networks. Neural Networks, 16(1):69–77.
Little, M. A., McSharry, P. E., Roberts, S. J., Costello,
D. A., and Moroz, I. M. (2007). Exploiting nonlin-
ear recurrence and fractal scaling properties for voice
disorder detection. Biomedical Engineering Online,
6(1):23.
Nitta, T. (2013). Local minima in hierarchical structures of
complex-valued neural networks. Neural Networks,
43:1–7.
Satoh, S. and Nakano, R. (2013). Multilayer perceptron
learning utilizing singular regions and search pruning.
In Proc. Int. Conf. on Machine Learning and Data
Analysis, pages 790–795.
Satoh, S. and Nakano, R. (2015). A yet faster version
of complex-valued multilayer perceptron learning us-
ing singular regions and search pruning. In Proc.
of 7th Int. Joint Conf. on Computational Intelligence
(IJCCI), volume 3 NCTA, pages 122–129.
Satoh, S. and Nakano, R. (2018). A new method for learn-
ing RBF networks by utilizing singular regions. In
Proc. 17th Int. Conf. on Artificial Intelligence and Soft
Computing (ICAISC), pages 214–225.
Schwefel, H.-P. (1981). Numerical Optimization of Com-
puter Models. John Wiley & Sons, Inc.
Schwenker, F., Kestler, H. A., and Palm, G. (2001). Three learning phases for radial-basis-function networks. Neural Networks, 14(4-5):439–458.
Wu, Y., Wang, H., Zhang, B., and Du, K.-L. (2012). Using
radial basis function networks for function approxi-
mation and classification. ISRN Applied Mathematics,
2012:1–34.