Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models

Ziyi Chang, Edmund Findlay, Haozheng Zhang, Hubert P. H. Shum

2023

Abstract

Generating realistic motions for digital humans is a core but challenging part of computer animation and games, as human motions are both diverse in content and rich in style. While the latest deep learning approaches have made significant advancements in this domain, they mostly treat motion synthesis and style manipulation as two separate problems. This is mainly due to the challenge of effectively learning, in a common representation, both motion content, which accounts for inter-class behaviour, and style, which accounts for intra-class behaviour. To tackle this challenge, we propose a denoising diffusion probabilistic model solution for styled motion synthesis. As diffusion models have a high capacity brought by the injection of stochasticity, we can represent both inter-class motion content and intra-class style behaviour in the same latent space. This results in an integrated, end-to-end trained pipeline that facilitates the generation of optimal motion and the exploration of the content-style coupled latent space. To achieve high-quality results, we design a multi-task diffusion model architecture that strategically generates aspects of human motions for local guidance. We also design adversarial and physical regulations for global guidance. We demonstrate superior performance with quantitative and qualitative results and validate the effectiveness of our multi-task architecture.
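For readers unfamiliar with the "injection of stochasticity" mentioned in the abstract, the following is a minimal sketch of the standard forward (noising) step of a generic denoising diffusion probabilistic model, applied to an array shaped like a motion clip. It is not the authors' architecture: the function names, the linear noise schedule, and the clip dimensions (60 frames, 21 joints, 3 coordinates) are all assumptions chosen for illustration.

```python
import numpy as np


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I) in closed form."""
    noise = rng.standard_normal(x0.shape)
    a = alpha_bars[t]
    xt = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise
    return xt, noise


rng = np.random.default_rng(0)
# Hypothetical motion clip: 60 frames, 21 joints, 3D coordinates (random stand-in data).
motion = rng.standard_normal((60, 21, 3))
alpha_bars = make_alpha_bars()
noisy, eps = q_sample(motion, 500, alpha_bars, rng)
```

In a typical diffusion setup, a network would be trained to predict `eps` from `noisy` and the step index, and generation would run the learned denoising in reverse from pure noise. How the paper conditions this process on content and style is described in the full text, not here.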


Paper Citation


in Harvard Style

Chang Z., Findlay E., Zhang H. and Shum H. (2023). Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 1: GRAPP; ISBN 978-989-758-634-7, SciTePress, pages 64-74. DOI: 10.5220/0011631000003417


in Bibtex Style

@conference{grapp23,
author={Ziyi Chang and Edmund Findlay and Haozheng Zhang and Hubert P. H. Shum},
title={Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models},
booktitle={Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 1: GRAPP},
year={2023},
pages={64-74},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011631000003417},
isbn={978-989-758-634-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 1: GRAPP
TI - Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models
SN - 978-989-758-634-7
AU - Chang Z.
AU - Findlay E.
AU - Zhang H.
AU - Shum H.
PY - 2023
SP - 64
EP - 74
DO - 10.5220/0011631000003417
PB - SciTePress