of using alternative learning algorithms such as Minimerror (Torres-Moreno et al., 2002), which is a perceptron learning rule with a cost function designed to reduce the number of classification errors rather than the mean squared error. The MOHN and its learning rules would also be usefully compared to deep networks, as the two present a stark contrast in approach.
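As a rough illustration of that contrast with mean squared error (a sketch of the general Minimerror idea, not the implementation of Torres-Moreno et al.), the cost below replaces squared error with a smoothed count of misclassified patterns via each pattern's stability (its signed margin), controlled by a temperature T that the full algorithm anneals during training. Bipolar targets y ∈ {-1, +1} are assumed, and all function names are illustrative only.

```python
import numpy as np

def minimerror_cost(w, X, y, T=1.0):
    """Smoothed error count: each term tends to 1 for a strongly
    misclassified pattern and to 0 for a confidently correct one."""
    # Stability (signed margin) of each pattern, normalised by ||w||.
    gamma = y * (X @ w) / np.linalg.norm(w)
    return 0.5 * np.sum(1.0 - np.tanh(gamma / (2.0 * T)))

def minimerror_step(w, X, y, T=1.0, lr=0.01):
    """One gradient-descent step on the smoothed cost."""
    norm = np.linalg.norm(w)
    gamma = y * (X @ w) / norm
    # d(cost)/d(gamma) = -sech^2(gamma / 2T) / (4T), per pattern.
    coef = -0.25 / T / np.cosh(gamma / (2.0 * T)) ** 2
    # d(gamma)/dw = y*x/||w|| - gamma*w/||w||^2, per pattern.
    dgamma_dw = (y[:, None] * X) / norm - np.outer(gamma, w) / norm ** 2
    return w - lr * (coef[:, None] * dgamma_dw).sum(axis=0)
```

As T approaches zero the cost tends to the raw error count, which is what makes a rule of this kind target classification errors rather than mean squared error.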
The issue of MOHN structure discovery was also raised, but the details are left for future work. The experiments presented in this paper worked on the assumption that the networks in question contained weights of sufficient order to capture the functions on which they were trained. Satisfying this assumption becomes increasingly difficult as the number of inputs grows, so problems with large numbers of inputs require a structure discovery phase to be carried out as part of the training process.
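To make the difficulty concrete, the hypothetical sketch below (an illustration only; the paper leaves the actual procedure to future work) shows the naive alternative: enumerate candidate weight index sets up to a maximum order and keep a candidate only if it improves held-out error. The enumeration grows combinatorially with the number of inputs, which is precisely why a cheaper discovery phase is needed. The function names and selection criterion are assumptions for illustration.

```python
import itertools
import numpy as np

def product_feature(X, idx):
    """A MOHN weight's input: the product of the selected inputs
    for each pattern (the empty product gives the bias column)."""
    return np.prod(X[:, list(idx)], axis=1)

def greedy_structure(Xtr, ytr, Xval, yval, max_order=3, max_weights=20):
    """Forward selection over candidate index sets. Held-out error is
    used because training error never rises when a weight is added."""
    def val_error(struct):
        Ftr = np.column_stack([product_feature(Xtr, s) for s in struct])
        w, *_ = np.linalg.lstsq(Ftr, ytr, rcond=None)
        Fval = np.column_stack([product_feature(Xval, s) for s in struct])
        return np.mean((Fval @ w - yval) ** 2)

    structure = [()]                      # start with just the bias weight
    best = val_error(structure)
    candidates = (c for k in range(1, max_order + 1)
                  for c in itertools.combinations(range(Xtr.shape[1]), k))
    for idx in candidates:
        err = val_error(structure + [idx])
        if err < best:
            structure, best = structure + [idx], err
        if len(structure) >= max_weights:
            break
    return structure
```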
With a given network structure, training a MOHN is faster and shows lower error variance across trials than training an MLP. Additionally, because a fixed-structure MOHN's output is a linear function of its weights, the squared-error surface is convex and the training algorithm has no local minima, making training more reliable than that of an MLP. Of course, any algorithm used to discover the correct structure for the MOHN may well have local optima, but that (again) is a matter for future work.
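To make the convexity point concrete: with the structure fixed, training reduces to ordinary linear least squares over product features. The sketch below assumes bipolar inputs and a hypothetical structure supplied as a list of weight index sets (the empty set denoting the bias).

```python
import numpy as np

def design_matrix(X, structure):
    """One column per MOHN weight: the product of that weight's
    inputs for each pattern (the empty set gives the bias column)."""
    return np.column_stack([np.prod(X[:, list(s)], axis=1)
                            for s in structure])

def fit_mohn(X, y, structure):
    """Least-squares weights for a fixed-structure MOHN: the model
    is linear in w, so the error surface is convex."""
    w, *_ = np.linalg.lstsq(design_matrix(X, structure), y, rcond=None)
    return w

# Example: XOR on bipolar inputs needs the second-order weight.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
y = -X[:, 0] * X[:, 1]                     # XOR in {-1, +1} coding
w = fit_mohn(X, y, structure=[(), (0,), (1,), (0, 1)])
print(np.round(w, 3))                      # ~ [0, 0, 0, -1]
```

For bipolar inputs the design matrix columns are Walsh functions, so the unique least-squares solution is the target's Walsh decomposition; the XOR example is captured exactly by the single second-order weight.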
REFERENCES
Beauchamp, K. (1984). Applications of Walsh and Related
Functions. Academic Press, London.
Caparrós, G. J., Ruiz, M. A. A., and Hernández, F. S. (2002). Hopfield neural networks for optimization: study of the different dynamics. Neurocomputing, 43(1-4):219–237.
Dobson, A. J. and Barnett, A. (2011). An introduction to generalized linear models. CRC Press.
Frean, M. (1990). The upstart algorithm: A method for constructing and training feedforward neural networks. Neural Computation, 2(2):198–209.
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The elements of statistical learning, volume 2. Springer.
Heckendorn, R. B. and Wright, A. H. (2004). Efficient linkage discovery by limited probing. Evolutionary Computation, 12(4):517–545.
Hopfield, J. J. (1982). Neural networks and physical sys-
tems with emergent collective computational abili-
ties. Proceedings of the National Academy of Sciences
USA, 79(8):2554–2558.
Hopfield, J. J. and Tank, D. W. (1985). Neural computa-
tion of decisions in optimization problems. Biological
Cybernetics, 52:141–152.
Kubota, T. (2007). A higher order associative memory with McCulloch-Pitts neurons and plastic synapses. In Neural Networks, 2007. IJCNN 2007. International Joint Conference on, pages 1982–1989.
Pelikan, M., Goldberg, D. E., and Cantú-Paz, E. E. (2000). Linkage problem, distribution estimation, and Bayesian networks. Evolutionary Computation, 8(3):311–340.
Shakya, S., McCall, J., Brownlee, A., and Owusu, G. (2012). DEUM - distribution estimation using Markov networks. In Shakya, S. and Santana, R., editors, Markov Networks in Evolutionary Computation, volume 14 of Adaptation, Learning, and Optimization, pages 55–71. Springer Berlin Heidelberg.
Swingler, K. (2012). On the capacity of Hopfield neural
networks as EDAs for solving combinatorial optimi-
sation problems. In Proc. IJCCI (ECTA), pages 152–
157. SciTePress.
Swingler, K. (2014). A Walsh analysis of multilayer perceptron function. In Proc. IJCCI (NCTA).
Swingler, K. and Smith, L. (2014a). Training and making calculations with mixed order hyper-networks. Neurocomputing, 141:65–75.
Swingler, K. and Smith, L. S. (2014b). An analysis of the local optima storage capacity of Hopfield network based fitness function models. Transactions on Computational Collective Intelligence XVII, LNCS 8790, pages 248–271.
Tibshirani, R. (1996). Regression shrinkage and selection
via the lasso. Journal of the Royal Statistical Society.
Series B (Methodological), pages 267–288.
Torres-Moreno, J.-M., Aguilar, J., and Gordon, M. (2002). Finding the number minimum of errors in N-dimensional parity problem with a linear perceptron. Neural Processing Letters, 1:201–210.
Torres-Moreno, J.-M. and Gordon, M. B. (1998). Efficient
adaptive learning for classification tasks with binary
units. Neural Computation, 10(4):1007–1030.
Venkatesh, S. S. and Baldi, P. (1991). Programmed inter-
actions in higher-order neural networks: Maximal ca-
pacity. Journal of Complexity, 7(3):316–337.
Walsh, J. (1923). A closed set of normal orthogonal func-
tions. Amer. J. Math, 45:5–24.
Wilson, G. V. and Pawley, G. S. (1988). On the stability of the travelling salesman problem algorithm of Hopfield and Tank. Biol. Cybern., 58(1):63–70.