REFERENCES
Alp Güler, R., Neverova, N., and Kokkinos, I. (2018). DensePose: Dense human pose estimation in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7297–7306.
Arjovsky, M., Chintala, S., and Bottou, L. (2017). Wasserstein generative adversarial networks. In International Conference on Machine Learning, pages 214–223.
Cao, Z., Simon, T., Wei, S.-E., and Sheikh, Y. (2017). Realtime multi-person 2D pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7291–7299.
Chen, X., Song, J., and Hilliges, O. (2019). Unpaired pose guided human image generation. CoRR, abs/1901.02284.
Everingham, M., Van Gool, L., Williams, C. K., Winn, J., and Zisserman, A. (2007). The PASCAL Visual Object Classes Challenge 2007 (VOC2007) results.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680.
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. C. (2017). Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5767–5777.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778.
Horiuchi, Y., Iizuka, S., Simo-Serra, E., and Ishikawa, H. (2019). Spectral normalization and relativistic adversarial training for conditional pose generation with self-attention. In 2019 16th International Conference on Machine Vision Applications (MVA), pages 1–5. IEEE.
Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1125–1134.
Jetchev, N. and Bergmann, U. (2017). The conditional analogy GAN: Swapping fashion articles on people images.
Jolicoeur-Martineau, A. (2018). The relativistic discriminator: A key element missing from standard GAN. arXiv preprint arXiv:1807.00734.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105.
Lassner, C., Pons-Moll, G., and Gehler, P. V. (2017). A generative model of people in clothing. In Proceedings of the IEEE International Conference on Computer Vision, pages 853–862.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A. C. (2016a). SSD: Single shot multibox detector. In European Conference on Computer Vision, pages 21–37. Springer.
Liu, Z., Luo, P., Qiu, S., Wang, X., and Tang, X. (2016b). DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1096–1104.
Ma, L., Jia, X., Sun, Q., Schiele, B., Tuytelaars, T., and Van Gool, L. (2017). Pose guided person image generation. In Advances in Neural Information Processing Systems, pages 406–416.
Miyato, T., Kataoka, T., Koyama, M., and Yoshida, Y. (2018). Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957.
Neverova, N., Alp Güler, R., and Kokkinos, I. (2018). Dense pose transfer. In The European Conference on Computer Vision (ECCV).
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., and Chen, X. (2016). Improved techniques for training GANs. In Advances in Neural Information Processing Systems, pages 2234–2242.
Siarohin, A., Sangineto, E., Lathuilière, S., and Sebe, N. (2018). Deformable GANs for pose-based human image generation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3408–3416.
Stewart, M. (2019). Advanced Topics in Generative Adversarial Networks (GANs). https://towardsdatascience.com/comprehensive-introduction-to-turing-learning-and-gans-part-2-fd8e4a70775. Accessed May 8, 2019.
Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Tao, A., Kautz, J., and Catanzaro, B. (2018). High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8798–8807.
Wang, Z., Bovik, A. C., Sheikh, H. R., Simoncelli, E. P., et al. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612.
Zhang, H., Goodfellow, I., Metaxas, D., and Odena, A. (2018). Self-attention generative adversarial networks. arXiv preprint arXiv:1805.08318.
Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., and Tian, Q. (2015). Scalable person re-identification: A benchmark. In Proceedings of the IEEE International Conference on Computer Vision, pages 1116–1124.