competitive AFs are found. This is remarkable given that only very basic candidate solutions are provided (in contrast to, e.g., Swish). Moreover, our approach adapts well to different kinds of problems.
REFERENCES
Clevert, D., Unterthiner, T., and Hochreiter, S. (2016). Fast and accurate deep network learning by exponential linear units (ELUs). In Proc. Int'l Conf. on Learning Representations.
De Jong, K. A. (2006). Evolutionary Computation: A Unified Approach. MIT Press.
Elfwing, S., Uchibe, E., and Doya, K. (2018). Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks, 107:3–11.
Glorot, X., Bordes, A., and Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proc. Int'l Conf. on Artificial Intelligence and Statistics.
Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
Gulcehre, C., Moczulski, M., Denil, M., and Bengio, Y. (2016). Noisy activation functions. In Proc. Int'l Conf. on Machine Learning.
Hagg, A., Mensing, M., and Asteroth, A. (2017). Evolving parsimonious networks by mixing activation functions. In Proc. Genetic and Evolutionary Computation Conf.
Hancock, P. J. B. (1992). Pruning neural nets by genetic algorithm. In Proc. Int'l Conf. on Artificial Neural Networks.
Hayou, S., Doucet, A., and Rousseau, J. (2018). On the selection of initialization and activation function for deep neural networks. arXiv:1805.08266.
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proc. IEEE Int'l Conf. on Computer Vision.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Identity mappings in deep residual networks. In Proc. European Conf. on Computer Vision.
Hochreiter, S. (1998). The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int'l Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 6(2):107–116.
Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2):251–257.
Igel, C. (2003). Neuroevolution for reinforcement learning using evolution strategies. In Proc. IEEE Congress on Evolutionary Computation.
Klambauer, G., Unterthiner, T., Mayr, A., and Hochreiter, S. (2017). Self-normalizing neural networks. In Advances in Neural Information Processing Systems.
Laurent, C., Pereyra, G., Brakel, P., Zhang, Y., and Bengio, Y. (2016). Batch normalized recurrent neural networks. In Proc. IEEE Int'l Conf. on Acoustics, Speech and Signal Processing.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521:436–444.
Liu, H., Simonyan, K., Vinyals, O., Fernando, C., and Kavukcuoglu, K. (2018). Hierarchical representations for efficient architecture search. In Proc. Int'l Conf. on Learning Representations.
Loshchilov, I. and Hutter, F. (2016). CMA-ES for hyperparameter optimization of deep neural networks. In Proc. Int'l Conf. on Learning Representations (Workshop track).
Maas, A. L., Hannun, A. Y., and Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML Workshop on Deep Learning for Audio, Speech and Language Processing.
Miikkulainen, R., Liang, J. Z., Meyerson, E., Rawal, A., Fink, D., Francon, O., Raju, B., Shahrzad, H., Navruzyan, A., Duffy, N., and Hodjat, B. (2017). Evolving deep neural networks. CoRR, abs/1703.00548.
Mishkin, D. and Matas, J. (2017). All you need is a good init. In Proc. Int'l Conf. on Learning Representations.
Mitchell, M. (1996). An Introduction to Genetic Algorithms. MIT Press.
Montana, D. J. and Davis, L. (1989). Training feedforward neural networks using genetic algorithms. In Proc. Int'l Joint Conf. on Artificial Intelligence.
Nair, V. and Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proc. Int'l Conf. on Machine Learning.
Pennington, J., Schoenholz, S. S., and Ganguli, S. (2018). The emergence of spectral universality in deep networks. arXiv:1802.09979.
Qiang, X., Cheng, G., and Wang, Z. (2010). An overview of some classical growing neural networks and new developments. In Proc. Int'l Conf. on Education Technology and Computer.
Ramachandran, P., Zoph, B., and Le, Q. V. (2018). Searching for activation functions. In Proc. Int'l Conf. on Learning Representations (Workshop track).
Schaffer, J. D., Whitley, D., and Eshelman, L. J. (1992). Combinations of genetic algorithms and neural networks: A survey of the state of the art. In Proc. Int'l Workshop on Combinations of Genetic Algorithms and Neural Networks.
Simonyan, K. and Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In Proc. Int'l Conf. on Learning Representations.
Stanley, K. O. and Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2):99–127.
Suganuma, M., Shirakawa, S., and Nagao, T. (2017). A genetic programming approach to designing convolutional neural network architectures. In Proc. Genetic and Evolutionary Computation Conf.
Sutskever, I., Martens, J., Dahl, G., and Hinton, G. (2013). On the importance of initialization and momentum in deep learning. In Proc. Int'l Conf. on Machine Learning.
Whitley, D. (2001). An overview of evolutionary algorithms: Practical issues and common pitfalls. Information and Software Technology, 43:817–831.