of metallurgy (Mézard et al., 1987)). Fixing J ≡ 1 for
simplicity, the Mattis Hamiltonian reads as
\[
H_N^{\mathrm{Mattis}}(\sigma|\xi) = -\frac{1}{2N}\sum_{i,j}^{N,N} \xi_i^1 \xi_j^1 \sigma_i \sigma_j \,-\, h \sum_{i}^{N} \xi_i^1 \sigma_i . \tag{10}
\]
The Mattis magnetization is defined as $m_1 = N^{-1}\sum_i \xi_i^1 \sigma_i$.
To inspect its lowest energy minima, we perform a
comparison with the CW model: in terms of the (standard)
magnetization, the Curie-Weiss model reads as
$H_N^{\mathrm{CW}} \sim -(N/2)m^2 - hm$ and, analogously, we can
write $H_N^{\mathrm{Mattis}}(\sigma|\xi)$ in terms of the Mattis
magnetization as $H_N^{\mathrm{Mattis}} \sim -(N/2)m_1^2 - hm_1$.
It is then evident that, in the low noise limit (namely where
collective properties may emerge), as the minimum of the free
energy is achieved in the Curie-Weiss model for $\langle m \rangle \to \pm 1$,
the same holds in the Mattis model for $\langle m_1 \rangle \to \pm 1$.
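This correspondence can be made fully explicit in one short derivation (a sketch; the variable $\tau$ is auxiliary notation we introduce here for illustration only): since $(\xi_i^1)^2 = 1$, the gauge change of variables $\tau_i = \xi_i^1 \sigma_i \in \{-1,+1\}$ maps the Mattis Hamiltonian onto the Curie-Weiss one,
\[
H_N^{\mathrm{Mattis}}(\sigma|\xi)
= -\frac{1}{2N}\sum_{i,j}^{N,N} \tau_i \tau_j \,-\, h \sum_{i}^{N} \tau_i
= H_N^{\mathrm{CW}}(\tau),
\]
so that $m_1(\sigma)$ is exactly the standard magnetization of the $\tau$ variables, and the two models share the same free energy.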
However, this implies that the spins now tend to align
parallel (or antiparallel) to the vector $\xi^1$; hence, if
the latter is, say, $\xi^1 = (+1,-1,-1,-1,+1,+1)$ in
a model with N = 6, the equilibrium configurations
of the network will be $\sigma = (+1,-1,-1,-1,+1,+1)$
and $\sigma = (-1,+1,+1,+1,-1,-1)$, the latter due to
the gauge symmetry $\sigma_i \to -\sigma_i$ enjoyed by the
Hamiltonian. Thus, the network relaxes autonomously to
a state where some of its neurons are firing while
others are quiescent, according to the stored pattern
$\xi^1$. Note that, as the entries of the vectors $\xi$ are
chosen randomly as ±1 with equal probability, the retrieved
free-energy minimum now corresponds to a spin
configuration which is also the most entropic by
the Shannon-McMillan argument, thus both the most
likely and the most difficult to handle (as its information
can no longer be compressed).
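As a quick numerical check of this N = 6 example, the sketch below (in Python; variable names are ours, not from the paper) enumerates all $2^6$ spin configurations and confirms that the only ground states of (10) at h = 0 are the two gauge-related configurations $\pm\xi^1$:

```python
import itertools

import numpy as np

# Pattern from the text's N = 6 example.
xi = np.array([+1, -1, -1, -1, +1, +1])
N = len(xi)

def mattis_energy(sigma, xi, h=0.0):
    """Mattis Hamiltonian, Eq. (10): H = -(1/2N) sum_ij xi_i xi_j s_i s_j - h sum_i xi_i s_i."""
    overlap = np.dot(xi, sigma)          # sum_i xi_i sigma_i (= N * m_1)
    return -overlap**2 / (2 * N) - h * overlap

# Exhaustive search over all 2^N spin configurations.
configs = [np.array(s) for s in itertools.product([-1, +1], repeat=N)]
energies = [mattis_energy(s, xi) for s in configs]
ground = [s for s, e in zip(configs, energies) if np.isclose(e, min(energies))]

for s in ground:
    print(s)   # prints exactly the two minima, xi and -xi
```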
Two remarks are now in order. On the one hand,
according to the self-consistency equation (5) and as
shown in Fig. 2 (right), $\langle m \rangle$ versus h displays the typical
graded/sigmoidal response of a charging neuron
(Tuckwell, 2005), and one would be tempted to call
the spins σ neurons. On the other hand, it is definitely
inconvenient to build a network out of N spins/neurons,
whose number is moreover meant to diverge (i.e. N → ∞),
in order to handle only one stored pattern of information.
Along the theoretical physics route, overcoming
this limitation is quite natural (and provides the first
derivation of the Hebbian prescription in this paper):
if we want a network able to cope with P patterns, the
starting Hamiltonian should simply sum over these P
previously stored patterns, namely
\[
H_N(\sigma|\xi) = -\frac{1}{2N}\sum_{i,j=1}^{N,N} \left( \sum_{\mu=1}^{P} \xi_i^{\mu} \xi_j^{\mu} \right) \sigma_i \sigma_j , \tag{11}
\]
where we neglect the external field (h = 0) for simplicity.
As we will see in the next section, this Hamiltonian
indeed constitutes the Hopfield model, namely
the harmonic oscillator of neural networks, whose
coupling matrix is called the Hebb matrix as it encodes
the Hebb prescription for neural organization (Amit, 1992).
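As an illustration, here is a minimal sketch (in Python; the variable names and the normalization $J_{ij} = N^{-1}\sum_\mu \xi_i^\mu \xi_j^\mu$, read off from (11), are our assumptions) of how the Hebb coupling matrix is assembled from P random ±1 patterns:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 5

# P random patterns with entries ±1 chosen with equal probability.
xi = rng.choice([-1, +1], size=(P, N))

# Hebb coupling matrix: J_ij = (1/N) * sum_mu xi_i^mu xi_j^mu.
J = xi.T @ xi / N
np.fill_diagonal(J, 0.0)   # no self-interaction

def hopfield_energy(sigma, J):
    """Hamiltonian of Eq. (11) at h = 0: H = -(1/2) sum_ij J_ij s_i s_j."""
    return -0.5 * sigma @ J @ sigma

# One-step stability check: for P << N each stored pattern is, with high
# probability, a fixed point of the dynamics s_i -> sign(sum_j J_ij s_j).
print(np.array_equal(np.sign(J @ xi[0]), xi[0]))   # expected: True
```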
Although the extension to the case P > 1 is formally
straightforward, the investigation of the system as P
grows becomes far trickier. Indeed, neural networks
belong to the realm of the so-called "complex systems".
We propose that complex behaviors can be distinguished
from simple behaviors as follows: for the latter, the
number of free-energy minima of the system does not
scale with the volume N, while for complex systems
the number of free-energy minima does scale with the
volume according to a proper function of N. For instance,
the Curie-Weiss/Mattis model has two minima
only, whatever N (even if N → ∞), and it constitutes
the paradigmatic example of a simple system. As
a counterpart, the prototype of a complex system is the
Sherrington-Kirkpatrick model (SK), originally introduced
in condensed matter physics to describe the peculiar
behaviors exhibited by real glasses (Hertz and Palmer,
1991; Mézard et al., 1987). This model has a number
of minima that scales $\propto \exp(cN)$, with c independent of N, and
its Hamiltonian reads as
\[
H_N^{\mathrm{SK}}(\sigma|J) = \frac{1}{\sqrt{N}}\sum_{i<j}^{N,N} J_{ij} \sigma_i \sigma_j , \tag{12}
\]
where, crucially, the couplings are Gaussian distributed,
$P(J_{ij}) \equiv \mathcal{N}(0,1)$. This implies that links can be either
positive (hence favoring parallel spin configurations)
or negative (hence favoring anti-parallel spin
configurations); thus, in the large-N limit, with large
probability, spins will receive conflicting signals, and
we speak of "frustrated networks". Indeed frustration,
the hallmark of complexity, is fundamental in
order to split the phase space into several disconnected
zones, i.e. in order to have several minima, or several
stored patterns in neural-network language. This
mirrors a clear requirement in electronics too, namely
the need for inverting amplifiers.
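To make frustration concrete, the short sketch below (in Python; illustrative, not from the paper) samples standard Gaussian couplings as in (12) and counts "frustrated" triangles, i.e. triples (i, j, k) with $J_{ij}J_{jk}J_{ki} < 0$, for which no assignment of the three spins can satisfy all three links at once; since P(J) is symmetric around zero, roughly half of all triangles turn out frustrated:

```python
import itertools

import numpy as np

rng = np.random.default_rng(1)
N = 30

# Symmetric couplings J_ij ~ N(0, 1), as in Eq. (12).
J = rng.standard_normal((N, N))
J = (J + J.T) / np.sqrt(2)       # keep unit variance after symmetrization
np.fill_diagonal(J, 0.0)

# A triangle (i, j, k) is frustrated when the product of its three
# couplings is negative: whatever the spins do, one link stays unhappy.
frustrated = sum(
    J[i, j] * J[j, k] * J[k, i] < 0
    for i, j, k in itertools.combinations(range(N), 3)
)
total = N * (N - 1) * (N - 2) // 6
print(f"frustrated triangles: {frustrated}/{total}")   # roughly 50%
```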
The mean-field statistical mechanics of the low-noise
behavior of spin glasses was first described
by Parisi, and it predicts a hierarchical organization of
states and a relaxational dynamics spread over many
timescales (for which we refer to specific textbooks
(Mézard et al., 1987)). Here we just need to know that
their natural order parameter is no longer the magnetization
(as these systems do not magnetize), but
the overlap $q_{ab}$, as we now explain. Spin glasses
are balanced ensembles of ferromagnets and antiferromagnets
(this can also be seen mathematically, as
P(J) is symmetric around zero) and, as a result, $\langle m \rangle$ is
always equal to zero; on the other hand, a comparison
between two realizations of the system (pertaining to