Authors:
Yasutomo Kawanishi 1; Hitoshi Nishimura 2 and Hiroshi Murase 3
Affiliations:
1 Guardian Robot Project, RIKEN, Kyoto, Japan; 2 KDDI Research, Saitama, Japan; 3 Graduate School of Informatics, Nagoya University, Aichi, Japan
Keyword(s):
Low Resolution, Human Pose Estimation, Temporal Information.
Abstract:
This paper addresses the problem of human pose estimation from an extremely low-resolution (ex-low) image sequence. In an ex-low image (e.g., 16 × 16 pixels), it is challenging, even for human beings, to estimate the human pose smoothly and accurately from a single frame because of the limited resolution and noise. This paper proposes a human pose estimation method, named Pose Transition Embedding Network, that exploits the temporal continuity of human pose transitions by using a pose-embedded manifold. The method first builds a pose transition manifold from ground-truth human pose sequences, learning feasible pose transitions with an encoder-decoder model named Pose Transition Encoder-Decoder. Then, a transformer-based image encoder, named Ex-Low Image Encoder Transformer, encodes an ex-low image sequence into an embedded vector. Finally, the estimated human pose is reconstructed from this vector by a pose decoder named Pose Transition Decoder. The performance of the method is confirmed by evaluating it on an ex-low human pose dataset generated from a publicly available action recognition dataset.
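
To give a feel for what an ex-low image sequence looks like as data, the following is a minimal sketch of how frames could be reduced to 16 × 16 pixels. The abstract does not say how the ex-low dataset was generated, so the block-averaging scheme, the function names `downsample_to_ex_low` and `downsample_sequence`, and the 64 × 64 synthetic input are all assumptions made for illustration, not the authors' procedure.

```python
def downsample_to_ex_low(frame, size=16):
    """Block-average a grayscale frame (list of rows) down to size x size.

    Assumption: the frame height and width are exact multiples of `size`.
    """
    height = len(frame)
    width = len(frame[0])
    if height % size or width % size:
        raise ValueError("frame dimensions must be multiples of the target size")
    block_h, block_w = height // size, width // size
    out = []
    for by in range(size):
        row = []
        for bx in range(size):
            # Sum every pixel that falls inside the current block.
            total = 0.0
            for y in range(by * block_h, (by + 1) * block_h):
                for x in range(bx * block_w, (bx + 1) * block_w):
                    total += frame[y][x]
            row.append(total / (block_h * block_w))
        out.append(row)
    return out


def downsample_sequence(frames, size=16):
    """Apply the same reduction to every frame of an image sequence."""
    return [downsample_to_ex_low(frame, size) for frame in frames]


# Hypothetical input: 8 synthetic 64 x 64 grayscale frames.
frames = [
    [[float((x + y + t) % 256) for x in range(64)] for y in range(64)]
    for t in range(8)
]
ex_low = downsample_sequence(frames)
print(len(ex_low), len(ex_low[0]), len(ex_low[0][0]))  # 8 16 16
```

In this sketch each output pixel summarizes a 4 × 4 block of the input, which illustrates why a single ex-low frame carries so little pose information and why the method draws on the whole sequence.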