reconstruction tasks. For the reconstruction of 64×64
regions in 128×128 images, the full GAN training
typically requires 20 epochs and takes approximately
45 minutes on a single NVIDIA RTX 3090 GPU. The
training time can be reduced significantly by
parallelizing across multiple GPUs.
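As a minimal illustration of the multi-GPU option, a model can be wrapped in PyTorch's `nn.DataParallel` so that each training batch is split across the visible devices; the small module below is a hypothetical stand-in, not the network used in this study.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in module; the paper's generator or
# discriminator would be wrapped the same way.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())

# Split each batch across all visible GPUs, if more than one exists;
# on a single GPU (or CPU) the model is used as-is.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

batch = torch.randn(4, 3, 128, 128)  # 4 images of 128x128
out = model(batch)                   # shape (4, 8, 128, 128)
```

For larger-scale runs, `DistributedDataParallel` is generally preferred over `DataParallel`, at the cost of more setup code.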
4 CONCLUSION
Intuitively, to better repair defective areas of an
image, it is essential to understand the content of the
surrounding regions and to use correlations in the
image content to infer the missing content. Pathak et
al. (2016) found that repaired image areas are
generally blurry, especially contours around the eyes
and nose. During training, I also found that even after
multiple rounds of training, the discriminator's error
remains essentially unchanged when the image
contours are blurry; that is, the discriminator is no
longer sensitive to this kind of blurring. I therefore
decided to use a pre-blurred image set for training and
to combine context with segmentation analysis to
enhance the discriminator's sensitivity to blurry
contours.
This research proposes a hybrid discriminator
network structure combining ResNet50 and U-Net for
image inpainting, which has good perception of facial
textures and segmentation structures. When combined
with GANs, it can significantly reduce blurring in
reconstructed images.
Specifically, first, a 2D Gaussian filter with
randomly drawn parameters is used to blur the images
and generate a pre-training blurred image set. Second,
through pre-training on this blurred image set, the
hybrid discriminator learns to distinguish original
images from blurred ones. Third, GAN networks are
built for image-generation training, and the
reconstruction loss values and the quality of the
reconstructed images are compared between this study
and other papers. The experimental results indicate a
significant decrease in the loss, while the subjective
visual quality of some reconstructed images is also
significantly improved, particularly in areas such as
the nose and eyes, where edges are more pronounced.
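The first step above, building the pre-blurred training set, can be sketched with a separable Gaussian filter in NumPy; the sigma range used here is an assumed choice for illustration.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalised 1D Gaussian kernel of length 2*radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def gaussian_blur2d(img, sigma):
    """Separable 2D Gaussian blur with reflect padding (grayscale image)."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)
    padded = np.pad(img, radius, mode="reflect")
    # Convolve rows, then columns; "valid" mode undoes the padding.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def make_blurred_set(images, rng, sigma_range=(1.0, 3.0)):
    """Blur each image with a randomly drawn sigma (assumed range)."""
    return [gaussian_blur2d(im, rng.uniform(*sigma_range)) for im in images]

rng = np.random.default_rng(0)
images = [rng.random((128, 128)) for _ in range(2)]
blurred = make_blurred_set(images, rng)
```

For the second step, originals can then be labelled 1 and their blurred copies 0, and the hybrid discriminator pre-trained with a binary cross-entropy loss to separate the two before adversarial training begins.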
In the future, further work on optimizing the
discriminator network structure, the pre-training
method, and the loss model design will continue, so as
to recognize textures of different sizes and thus
achieve better general image reconstruction results.
REFERENCES
R. Shah, A. Gautam and S. K. Singh, “Overview of Image
Inpainting Techniques: A Survey,” 2022 IEEE Region
10 Symposium (TENSYMP), 2022, pp. 1-6.
D. Pathak, P. Krahenbuhl, J. Donahue, T. Darrell, and A.A.
Efros. “Context encoders: Feature learning by
inpainting,” Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, 2016, pp.
2536–2544.
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D.
Warde-Farley, S. Ozair, A. Courville, and Y. Bengio.
“Generative adversarial nets,” Proceedings of
Advances in neural information processing systems
(NIPS), vol. 27, 2014, pp. 2672–2680.
S. Iizuka, E. Simo-Serra, and H. Ishikawa. “Globally and
locally consistent image completion,” ACM
Transactions on Graphics, vol. 36, 2017, pp. 1–14.
J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E.
Tzeng, and T. Darrell. “Decaf: A deep convolutional
activation feature for generic visual recognition,”
Proceedings of the 31st International Conference on
Machine Learning (ICML), 2014, pp. 647-655.
C. Doersch, A. Gupta, and A.A. Efros. “Unsupervised visual
representation learning by context prediction,”
Proceedings of the IEEE International Conference on
Computer Vision (ICCV), 2015, pp. 1422-1430.
R. A. Yeh, C. Chen, T. Yian Lim, A. G. Schwing, M.
Hasegawa-Johnson, and M. N. Do. “Semantic image
inpainting with deep generative models,” Proceedings
of the IEEE Conference on Computer Vision and
Pattern Recognition, 2017, pp. 5485–5493.
CelebA-HQ resized 256×256 dataset, Kaggle,
https://www.kaggle.com/datasets/badasstechie/celebahq-resized-256x256
K. He, X. Zhang, S. Ren and J. Sun. “Deep Residual
Learning for Image Recognition,” IEEE Conference on
Computer Vision and Pattern Recognition (CVPR),
2016, pp. 770-778.
Z. Shen, W.-S. Lai, T. Xu, J. Kautz, and M.-H. Yang,
“Deep Semantic Face Deblurring,” Proceedings of the
IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2018, pp. 8260-8269.
O. Ronneberger, P. Fischer, and T. Brox, “U-Net:
Convolutional Networks for Biomedical Image
Segmentation,” Proceedings of the International
Conference on Medical Image Computing and
Computer-Assisted Intervention (MICCAI), Springer,
2015, pp. 234–241.
DAML 2023 - International Conference on Data Analysis and Machine Learning