3.1 Introduced Weight: Learning of the Generator
In previous CGANs, conditions vanish near the output layer because the conditional vector y ∈ {0, 1} is given only at the input layer. Thus, the generator in the proposed method also inputs conditions to hidden layers other than the input layer, in the manner of a skip connection. This approach ensures that conditions are reflected up to the layers near the output. In addition, previous facial image generation methods directly input the binary condition vector to the generator. In contrast, the proposed method applies a 1 × 1 convolution and a sigmoid function to the condition vector y, expressed in binary, and inputs the result to the generator. Therefore, we represent the condition vector y as continuous values from 0 to 1. Moreover, each condition can be weighted because the filter size of the convolution is 1 × 1. By weighting conditions, the proposed method can reflect conditions stepwise throughout the whole generator, because the most suitable conditions can be reflected at each layer at generation time. Furthermore, we use Pixelwise Normalization instead of Batch Normalization. Pixelwise Normalization is a normalization method used in PGGANs that improves the quality of generated images. It is represented as
b_{x,y} = \frac{a_{x,y}}{\sqrt{\frac{1}{N}\sum_{j=0}^{N-1}\left(a_{x,y}^{j}\right)^{2} + \varepsilon}}, (3)
where N is the number of feature maps, a_{x,y} and b_{x,y} are the feature vectors before and after normalization, respectively, and ε = 10^{-8}. This series of processes is illustrated in Figure 1.
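To make the two ingredients above concrete, the following is a minimal NumPy sketch, not the authors' implementation; the function and variable names are our assumptions, and the 1 × 1 convolution is read as a learnable per-condition weight and bias. The binary condition vector is mapped through the weighted 1 × 1 convolution and a sigmoid to continuous conditions in (0, 1), and pixelwise normalization follows Eq. (3):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def weight_conditions(y, w, b):
    """1x1 convolution over the condition vector: because the kernel is
    1x1, each binary condition y[i] gets its own weight w[i] and bias
    b[i]; the sigmoid maps the result to a continuous value in (0, 1)."""
    return sigmoid(w * y + b)

def pixelwise_norm(a, eps=1e-8):
    """Pixelwise Normalization, Eq. (3): each spatial location (x, y) is
    divided by the RMS of its N channel activations (a: shape (N, H, W))."""
    return a / np.sqrt(np.mean(a ** 2, axis=0, keepdims=True) + eps)

# Hypothetical binary attribute conditions, e.g. [smiling, glasses, male]
y = np.array([1.0, 0.0, 1.0])
w = np.array([2.0, 0.5, 1.0])                  # learned per-condition weights
cond = weight_conditions(y, w, b=np.zeros(3))  # continuous values in (0, 1)

b_norm = pixelwise_norm(np.random.randn(8, 4, 4))
```

After normalization, the RMS over the N feature maps at every pixel is approximately 1, which is what keeps activation magnitudes under control during training.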
3.2 Multi-task Discriminator
Real or generated images are inputted to the discriminator, which simultaneously considers the inputted conditions to distinguish between them. The discriminator in our proposed method performs multiple tasks so as to recognize the given conditions when the generator generates images. Figure 2 shows the multi-task network. The adversarial
branch and recognition branch in Figure 2 represent
a previous task of GANs and condition recognition,
respectively. In CGANs and Conditional DCGANs, conditions are also given to the discriminator, but the proposed method adds the recognition branch. Minimizing the condition recognition error, which is computed using the conditions inputted to the generator, can be regarded as an alternative way of giving conditions to the discriminator. Minibatch Stddev is the standard deviation computed over the minibatch for calibration; this technique, used in PGGANs, enables diverse images to be generated.
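The Minibatch Stddev layer can be sketched as follows (a minimal NumPy version of the PGGAN-style operation; the shapes and names are our assumptions): the standard deviation of each feature over the minibatch is averaged into one scalar, which is appended as an extra feature map so that the discriminator can penalize a generator whose samples lack diversity:

```python
import numpy as np

def minibatch_stddev(x):
    """Append one feature map carrying the batch-wide diversity signal.
    x: activations of shape (batch, channels, H, W)."""
    std = np.std(x, axis=0)    # per-location std-dev over the batch
    mean_std = np.mean(std)    # collapse to a single scalar summary
    extra = np.full((x.shape[0], 1, x.shape[2], x.shape[3]), mean_std)
    return np.concatenate([x, extra], axis=1)
```

If every sample in the batch were identical, the appended map would be all zeros, giving the discriminator an easy cue that the batch lacks variety.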
The condition recognition error is added to the objective function of previous CGANs; thereby, the adversarial learning of the generator reflects conditions more strongly. The objective function of our proposed method is indicated as
\min_{G}\max_{D} V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim P(z)}[\log(1 - D(\tilde{x}))] \wedge \min L, (4)
where L is the condition recognition error. If a dataset of real facial images is used, our proposed method finds it difficult or impossible to recognize the images with the softmax function and cross-entropy error, because the multiple facial attributes in such a dataset are represented in binary. If mean squared error is used, one recognition branch per attribute to be recognized is required, and the calculation cost is high. Hence, we calculate the error with sigmoid cross entropy, because it allows the recognition error of multiple facial attributes to be computed with a single recognition branch.
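The choice of sigmoid cross entropy can be illustrated as follows (a NumPy sketch, not the authors' code): one sigmoid output per binary attribute, so a single recognition branch scores all attributes at once; the numerically stable form avoids overflow for large logits:

```python
import numpy as np

def sigmoid_cross_entropy(logits, targets):
    """Multi-label loss over binary facial attributes.
    Numerically stable form of -t*log(s(x)) - (1-t)*log(1-s(x)),
    with s the sigmoid: max(x, 0) - x*t + log(1 + exp(-|x|))."""
    x, t = logits, targets
    return np.mean(np.maximum(x, 0) - x * t + np.log1p(np.exp(-np.abs(x))))

# One branch handles every attribute: logits and binary targets
# share the shape (batch, num_attributes).
logits = np.array([[2.0, -1.0, 0.0]])
targets = np.array([[1.0, 0.0, 1.0]])
loss = sigmoid_cross_entropy(logits, targets)
```

With mean squared error, by contrast, the text notes that a separate recognition branch per attribute would be needed, which is what the shared-shape logits above avoid.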
3.3 Obtain Feature Vector: Encoder and Fine-tuned Generator
Generative methods such as α-GANs and BiGANs use adversarial learning and an encoder, and generate images without fine-tuning the generator. Therefore, these methods frequently generate unclear images in early learning. In addition, previous techniques have a high cost because they require multiple networks to be updated. Thus, we propose a learning method that uses an encoder and a fine-tuned generator. Clear facial images can be generated from the start of learning with the fine-tuned generator, and our method generates images that maintain the identity of the inputted images by feeding the features obtained from the encoder to the generator. Algorithm 1 details the proposed learning process, and Figure 3 provides an illustration.
Both f and f̂ are features output from the encoder; the former comes from real images and the latter from generated images. Moreover, all L terms in Algorithm 1 are mean squared errors, but the errors differ: L_real is the error between real images and their reconstructions, L_noise is the error between the noise vector and the embedded features of the image generated from that noise vector, and L_fake is the error between reconstructed images and images generated from the noise vector. Our proposed learning algorithm fixes the parameters of the generator and updates only the encoder.
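Under our reading of Algorithm 1, one encoder update step might be sketched as follows (a NumPy toy with E and G as callables; only the encoder would receive gradients, the generator's parameters stay fixed, and the names are our assumptions):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays."""
    return np.mean((a - b) ** 2)

def encoder_losses(E, G, x_real, z):
    """The three MSE terms of Algorithm 1. G is the fine-tuned
    generator (parameters fixed); E is the encoder being trained."""
    f = E(x_real)        # features of real images
    x_rec = G(f)         # reconstructions of the real images
    x_fake = G(z)        # images generated from the noise vector
    f_hat = E(x_fake)    # embedded features of the generated images
    L_real = mse(x_real, x_rec)    # real images vs. reconstructions
    L_noise = mse(z, f_hat)        # noise vector vs. embedded features
    L_fake = mse(x_rec, x_fake)    # reconstructions vs. noise-generated
    return L_real, L_noise, L_fake
```

A perfect encoder-generator pair (one that inverts the other exactly) drives all three terms to zero, which is the fixed point this training seeks.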
Facial Image Generation by Generative Adversarial Networks using Weighted Conditions