There are not enough datasets that include 3D joint
rotations for images, so we focused on normal twist
rotations around the arms. As future work, we will
create a larger dataset to evaluate our model in more
detail and to estimate twist rotations around the other
limbs across a variety of poses. We also believe that
creating a larger training dataset and applying data
balancing to counter the bias in twist angles would be
effective strategies for improving the robustness and
accuracy of our model. We will also tackle the problem
of pose reconstruction from noisy 3D body joint
locations and compare the performance of our approach
with that of previous approaches.
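
One possible form of the data balancing mentioned above is inverse-frequency resampling over twist-angle bins. The following is a minimal sketch under our own assumptions, not the procedure used in this work: the function name `twist_balancing_weights`, the bin count, and the angle range in degrees are all hypothetical choices made for illustration.

```python
import numpy as np

def twist_balancing_weights(twist_deg, num_bins=12, angle_range=(-180.0, 180.0)):
    """Return sampling weights that are inversely proportional to how
    often each sample's twist-angle bin occurs (hypothetical helper)."""
    twist_deg = np.asarray(twist_deg, dtype=float)
    # Equal-width bin edges over the assumed angle range.
    edges = np.linspace(angle_range[0], angle_range[1], num_bins + 1)
    # Map each angle to a bin index; clip so out-of-range angles and the
    # upper edge fall into the first or last bin.
    bin_idx = np.clip(np.digitize(twist_deg, edges) - 1, 0, num_bins - 1)
    counts = np.bincount(bin_idx, minlength=num_bins)
    # Samples in rare bins get large weights; every occupied bin ends up
    # with the same total weight.
    weights = 1.0 / counts[bin_idx]
    return weights / weights.sum()

if __name__ == "__main__":
    # Toy twist angles: most samples are near zero, one is strongly twisted.
    angles = [5.0, 10.0, 15.0, 20.0, 100.0]
    w = twist_balancing_weights(angles)
    rng = np.random.default_rng(0)
    # Draw a rebalanced set of training indices with replacement.
    resampled = rng.choice(len(angles), size=1000, p=w)
    print(w)
    print(np.bincount(resampled, minlength=len(angles)))
```

With such weights, a rare strongly twisted sample is drawn about as often as all the near-zero samples in a crowded bin combined. Whether this helps in practice would still have to be checked experimentally.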