In fact, E is a function of a complex variable, but it takes only real values and is not regular (holomorphic) as a complex function; that is, E is not complex differentiable. Nevertheless, it is possible to derive a learning rule by considering the partial derivatives. In that case, the learning dynamics of a complex-valued neuron differ according to whether the parameters (weights and threshold) are expressed in an orthogonal coordinate system or in a polar coordinate system. Although the error function E was discussed here, the same argument applies to the complex differentiability of the activation function f_C (see (Hirose, 2006, pp. 18-22) for details).
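The reason, a standard fact of complex analysis recalled here for completeness, is that a real-valued function can satisfy the Cauchy-Riemann equations only if it is constant:

```latex
% Write E = u + iv with v \equiv 0 (E takes only real values).
% The Cauchy-Riemann equations then force every partial derivative
% of u to vanish:
\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = 0,
\qquad
\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} = 0,
% so a nonconstant real-valued E cannot be holomorphic.
```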
A learning rule is derived as follows using the steepest descent method: for any 0 ≤ k ≤ N,
\[
\Delta r_k(n) \overset{\mathrm{def}}{=} r_k(n+1) - r_k(n) = -\varepsilon \cdot \frac{\partial E}{\partial r_k} = \varepsilon \cdot \mathrm{Re}\bigl[\delta \cdot \bar{z}_k \cdot \exp[i\theta_k(n)]\bigr], \tag{9}
\]
\[
\Delta \theta_k(n) \overset{\mathrm{def}}{=} \theta_k(n+1) - \theta_k(n) = -\varepsilon \cdot \frac{\partial E}{\partial \theta_k} = -\varepsilon \cdot r_k(n) \cdot \mathrm{Im}\bigl[\delta \cdot \bar{z}_k \cdot \exp[i\theta_k(n)]\bigr], \tag{10}
\]
where δ := t − v, z̄ is the complex conjugate of a complex number z, and n is a variable that represents the number of learning cycles. For example, r_k(n) expresses the value of the parameter r_k after n learning cycles.
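As a concrete illustration, the updates of Eqs. (9) and (10) can be sketched in Python. The output convention v = Σ_k conj(w_k)·z_k with w_k = r_k·exp[iθ_k] and a linear activation are assumptions made here (under this convention the stated right-hand sides are exactly −ε times the partial derivatives of E = (1/2)|t − v|²); all names are illustrative.

```python
import numpy as np

def polar_step(r, theta, z, t, eps=0.5):
    """One steepest-descent step of Eqs. (9) and (10).

    Assumptions (not fixed by this excerpt): linear activation
    f_C(u) = u and output v = sum_k conj(w_k) * z_k with
    w_k = r_k * exp(i * theta_k); under this convention the updates
    below are exactly -eps times the partial derivatives of
    E = (1/2) * |t - v|**2 with respect to r_k and theta_k.
    """
    w = r * np.exp(1j * theta)
    v = np.sum(np.conj(w) * z)
    delta = t - v                                  # delta := t - v
    g = delta * np.conj(z) * np.exp(1j * theta)    # delta * conj(z_k) * e^{i theta_k}
    r_new = r + eps * np.real(g)                   # Eq. (9)
    theta_new = theta - eps * r * np.imag(g)       # Eq. (10)
    return r_new, theta_new
```

Iterated on a single training pattern with a sufficiently small ε, this step drives the squared error toward zero.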
A learning rule for a single complex-valued neuron whose weight is expressed in polar coordinates is derived using the steepest descent method in the reference (Hirose, 2006, pp. 59-64). Because the following nonlinear function (an amplitude-phase type activation function) is used as the activation function of the complex-valued neuron concerned, its expression differs from the learning rule derived in this paper:
\[
f_{\mathrm{ap}}(u) = \tanh(|u|) \cdot \exp[i \cdot \arg(u)], \quad u \in \mathbb{C}. \tag{11}
\]
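For reference, Eq. (11) is straightforward to express in code (the function name `f_ap` is chosen here to mirror the notation):

```python
import cmath
import math

def f_ap(u: complex) -> complex:
    """Amplitude-phase activation of Eq. (11): the amplitude is
    squashed by tanh, the phase is passed through unchanged."""
    return math.tanh(abs(u)) * cmath.exp(1j * cmath.phase(u))
```

The amplitude is bounded to [0, 1) while arg(f_ap(u)) = arg(u), which is what distinguishes this activation from the linear f_C used later in the experiment.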
For any 0 ≤ k ≤ N, define
\[
M^{r}_{k} \overset{\mathrm{def}}{=} \{\, (r, \Theta) \in M \mid \Delta r_k = 0 \,\}, \tag{12}
\]
\[
M^{\theta}_{k} \overset{\mathrm{def}}{=} \{\, (r, \Theta) \in M \mid \Delta \theta_k = 0 \,\}. \tag{13}
\]
Then the learning rules (Eqs. (9) and (10)) yield
\[
M^{r}_{k} = \bigl\{\, (r, \Theta) \in M \bigm| \mathrm{Re}\bigl[\delta \bar{z}_k \cdot \exp[i\theta_k]\bigr] = 0 \,\bigr\}, \tag{14}
\]
\[
M^{\theta}_{k} = \bigl\{\, (r, \Theta) \in M \bigm| r_k \cdot \mathrm{Im}\bigl[\delta \bar{z}_k \cdot \exp[i\theta_k]\bigr] = 0 \,\bigr\}. \tag{15}
\]
Table 1: Training patterns used in the experiment.

            Input         Output
Pattern 1   1.0           0.5i
Pattern 2   0.5 − 0.5i    −0.5 + 0.5i
Pattern 3   −0.5 − 0.5i   1.0 − 0.5i
Next, the behavior of learning near singular points is investigated. Near the singular points r_k = 0 (k = 0, ..., N), Eqs. (9) and (10) yield, for k = 0, ..., N,
\[
\Delta r_k = \varepsilon \cdot \mathrm{Re}\bigl[\delta \bar{z}_k \cdot \exp[i\theta_k]\bigr], \tag{16}
\]
\[
\Delta \theta_k \approx 0. \tag{17}
\]
Therefore, the velocity of change of the amplitudes r_k (k = 0, ..., N) is higher than that of the phases θ_k (k = 0, ..., N), and the state is attracted to the submanifold ∩_{k=0}^{N} M^r_k (a state with Δr_k ≈ 0 (k = 0, ..., N) is approached). That is, an equilibrium state ∩_{k=0}^{N} {M^r_k ∩ M^θ_k} is reached, and consequently the parameters (r, Θ) ∈ M change only slightly. This is a plateau phenomenon in a learning curve, which is the same as that in the learning dynamics near singular points of a real-valued neural network, as demonstrated in an earlier study (Amari et al., 2006).
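The attraction described above can be observed numerically. Under the same assumed output convention as before (linear activation, v = conj(w)·z for a one-weight neuron; names are illustrative), a parameter started near r = 0 shows a first-order change in r but only an O(r) change in θ, because Δθ in Eq. (10) carries the factor r(n):

```python
import numpy as np

def deltas(r, theta, z, t, eps=0.5):
    """Single-step increments of Eqs. (9) and (10) for a one-weight
    neuron with zero threshold, assuming output v = conj(w) * z."""
    v = np.conj(r * np.exp(1j * theta)) * z
    g = (t - v) * np.conj(z) * np.exp(1j * theta)
    return eps * g.real, -eps * r * g.imag   # (delta_r, delta_theta)

# Near the singular point r = 0 the phase is effectively frozen:
dr, dtheta = deltas(r=1e-5, theta=0.8, z=0.5 - 0.5j, t=-0.5 + 0.5j)
print(abs(dr), abs(dtheta))   # |dr| is O(eps); |dtheta| is O(eps * r)
```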
3 EXPERIMENT
In this section, the behavior of learning near the singular points of a polar variable complex-valued neuron is investigated experimentally.

A polar variable complex-valued neuron with one input is used for simplicity. The activation function f_C is assumed to be the linear function:
\[
f_C(z) = z, \quad z = x + iy. \tag{18}
\]
We assume that the threshold w_0 = r_0·exp[iθ_0] ≡ 0. That is, the only learnable parameter is the single weight w_1 = r_1·exp[iθ_1]. The general steepest descent method (Eqs. (9), (10)) was used for learning, with the learning rate set to 0.5. The training patterns are of three types, as shown in Table 1. Learning was judged to have converged, and was terminated, when the learning error (1/2)|t − v|^2 dropped to 0.0001 or less, where t is the teacher signal and v is the actual output value of the polar variable complex-valued neuron.
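This setup can be sketched directly (a minimal reproduction under the same assumed output convention v = conj(w_1)·z as above, with the threshold fixed at zero; the names and the choice of Pattern 1 are illustrative):

```python
import numpy as np

def train(z, t, r0, theta0, eps=0.5, max_iter=100000):
    """Train one polar complex weight with Eqs. (9) and (10) until the
    learning error (1/2)|t - v|^2 drops to 0.0001 or less.
    Returns (r, theta, number_of_iterations)."""
    r, theta = r0, theta0
    for n in range(max_iter):
        v = np.conj(r * np.exp(1j * theta)) * z   # linear f_C, zero threshold
        delta = t - v
        if 0.5 * abs(delta) ** 2 <= 1e-4:
            return r, theta, n
        g = delta * np.conj(z) * np.exp(1j * theta)
        # Tuple assignment: the theta update uses the pre-update r.
        r, theta = r + eps * g.real, theta - eps * r * g.imag
    return r, theta, max_iter

# Pattern 1 of Table 1, started away from the singular point r = 0:
r, theta, n = train(z=1.0 + 0j, t=0.5j, r0=1.0, theta0=0.0)
```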
At the singular point of the polar variable complex-valued neuron described above, r_1 = 0 (the amplitude of the weight w_1 is zero). Therefore, the initial value of r_1 was set to 0.00001, assuming the case in which learning starts near the singular point r_1 = 0 (Case 1 of Table 2). Moreover, the initial value r_1 = 1.0 was adopted assuming that learning started
ICAART 2014 - International Conference on Agents and Artificial Intelligence