VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications