(Pathak et al., 2016; Iizuka et al., 2017), and more natural image-quality improvement has been achieved. However, these methods are based on the assumption that the region to be complemented is known. In free-viewpoint video generation, on the other hand, it is difficult to identify regions of low image quality, since these depend on the capturing conditions. This makes it difficult to apply conventional image-reconstruction techniques to the problem of image-quality degradation.
In this paper, we employ deep learning with a generative adversarial network (GAN) to learn the relationship in appearance between generated omnidirectional free-viewpoint (OFV) images and captured images. Using the learning result (the generator of the GAN), we develop a method to improve the image quality of OFV images. It is well known that variation in the training data affects the efficiency of deep learning. The appearance of an omnidirectional image is significantly distorted by its unique optical system. Therefore, when the viewpoint of the omnidirectional camera changes, the appearance of the same region also changes drastically; in other words, the same region is observed with various appearances. To improve the learning efficiency of deep learning, we reduce these changes in appearance caused by lens distortion. Specifically, we divide an omnidirectional image into multiple perspective-projection images to reduce the variation in appearance.
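The division described above amounts to resampling an equirectangular (latitude-longitude) omnidirectional image through a virtual pinhole camera. The following is a minimal sketch of that operation, not the authors' implementation; the function name, the nearest-neighbour sampling, and the camera-convention choices (z forward, y down) are assumptions made for this illustration.

```python
import numpy as np

def equirect_to_perspective(equirect, fov_deg, yaw_deg, pitch_deg, out_size):
    """Sample one perspective-projection view from an equirectangular image.

    equirect: H x W x C array in latitude-longitude layout.
    fov_deg:  horizontal field of view of the virtual pinhole camera.
    yaw_deg, pitch_deg: viewing direction of the virtual camera, in degrees.
    out_size: (out_h, out_w) of the perspective image.
    """
    H, W = equirect.shape[:2]
    out_h, out_w = out_size
    # Focal length (in pixels) implied by the requested field of view.
    f = 0.5 * out_w / np.tan(np.radians(fov_deg) / 2)

    # Pixel grid of the virtual camera, centred on the optical axis.
    x = np.arange(out_w) - (out_w - 1) / 2
    y = np.arange(out_h) - (out_h - 1) / 2
    xx, yy = np.meshgrid(x, y)

    # Unit ray direction for every output pixel (z forward, y down).
    dirs = np.stack([xx, yy, np.full_like(xx, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate the rays by pitch (about x), then yaw (about y).
    p, t = np.radians(pitch_deg), np.radians(yaw_deg)
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(p), -np.sin(p)],
                   [0, np.sin(p),  np.cos(p)]])
    Ry = np.array([[ np.cos(t), 0, np.sin(t)],
                   [ 0,         1, 0        ],
                   [-np.sin(t), 0, np.cos(t)]])
    dirs = dirs @ (Ry @ Rx).T

    # Ray direction -> longitude/latitude -> source pixel (nearest neighbour).
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])           # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))      # [-pi/2, pi/2]
    u = ((lon / np.pi + 1) / 2 * (W - 1)).astype(int)
    v = ((lat / (np.pi / 2) + 1) / 2 * (H - 1)).astype(int)
    return equirect[v, u]
```

Tiling several such views (e.g. a few yaw angles at fixed field of view) covers the sphere with images in which a given scene region keeps a near-constant, perspective-correct appearance, which is the property the training set benefits from.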
2 RELATED WORKS
2.1 Display of Multi-viewpoint
Omnidirectional Images
In Google Street View (Google, 2007), it is possible to observe the surrounding view using omnidirectional images. By switching among omnidirectional images shot from multiple viewpoints according to the viewpoint movement specified by the observer, it is possible to grasp the situation in more detail while looking around the scene. By combining image blending and image-shape transformation, the observer gets the sensation of moving around the scene. We also estimated the position and rotation of the omnidirectional camera and the 3D shape of the capturing space by applying 3D reconstruction to the multi-viewpoint omnidirectional images. Using the estimated 3D information, we developed a Bullet-Time video generation method that switches the viewpoint while gazing at the point to be observed (Takeuchi et al., 2018). However, this omnidirectional image-switching approach restricts the viewer's movement to the capturing positions.
2.2 Free-viewpoint Images
There has been much research on free-viewpoint im-
ages. Model-based rendering (MBR) (Agarwal et al.,
2009; Kitahara et al., 2004; Kanade et al., 1997; Shin
et al., 2010; Newcombe et al., 2011; Orts-Escolano et
al., 2016) reproduces a view from an arbitrary view-
point using a 3D computer graphics (CG) model re-
constructed from multi-viewpoint images of the cap-
turing space. Image-based rendering (IBR) (Seitz et
al., 1996; Levoy et al., 1996; Tanimoto et al., 2012;
Matusik et al., 2000; Hedman et al., 2016) synthesizes the appearance directly from the captured multi-viewpoint images.
In MBR, the quality of the generated free-viewpoint images depends on the accuracy of the reconstructed 3D CG model. For this reason, when capturing a complicated space where 3D reconstruction errors are likely to occur, artifacts may appear in the generated view. Furthermore, the occlusions inherent in observation with multiple cameras make it challenging to reconstruct an accurate 3D shape, thus degrading the quality of the generated images (Shin et al., 2010).
Since IBR does not explicitly reconstruct the 3D shape but applies a simple proxy shape instead, it can generate free-viewpoint images without considering the complexity of the capturing space. However, when the applied shape differs greatly from the actual shape of the capturing space, the appearance of the generated view is significantly distorted by the image-fitting error. To reduce this distortion and generate an acceptable view, it is necessary to increase the number of capturing cameras.
2.3 Image-quality Improvement
Image-quality improvement has been studied actively. One method complements the appearance of an image by finding corresponding image information using the continuity of the surrounding image (Barnes et al., 2009), and this method has also been applied to complement free-viewpoint video (Shishido et al., 2017). However, this approach cannot reconstruct information that is not observed in the image. Various approaches using convolutional neural networks and GANs to reconstruct information not included in the image have been proposed, but these methods assume that the missing region is known (Pathak et al., 2016; Iizuka et al., 2017). By applying reconstruction utilizing GAN to transform