Authors: Hayato Yumiya 1; Daisuke Deguchi 1; Yasutomo Kawanishi 2 and Hiroshi Murase 1
Affiliations: 1 Institute of Intelligent System, Nagoya University, Japan; 2 RIKEN GRP, Japan
Keyword(s):
3D Human Posture, Gaze Grounding, Metric Learning, Person-to-Person Differences.
Abstract:
In this study, we address the novel problem of end-to-end gaze grounding, which estimates the area of an object at which a person in an image is gazing, focusing especially on images of people seen from behind. Existing methods usually first estimate facial information such as eye gaze and face orientation, and then estimate the area at which the target person is gazing; they therefore do not work when a person is pictured from behind. We instead focus on an individual's posture, a feature that can be obtained even from behind. Posture changes depending on where a person is looking, although this relationship varies from person to person. We propose an end-to-end model that estimates the area at which a person is gazing from their 3D posture. To minimize differences between individuals, we also introduce the Posture Embedding Encoder Module as a metric learning module. To evaluate the proposed method, we constructed an experimental environment in which a person gazed at a certain object on a shelf, and built a dataset consisting of pairs of 3D skeletons and gazes. In an evaluation on this dataset, we confirmed that the proposed method can estimate the area at which a person is gazing from behind.
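The metric-learning idea behind the Posture Embedding Encoder Module can be illustrated with a minimal sketch. The encoder shape (a linear map over a flattened 17-joint skeleton), the embedding size, and the triplet margin loss used here are assumptions for illustration only; the paper's actual module is a learned network whose details are not given in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a 3D skeleton with 17 joints flattened to 51 values,
# projected into a 32-dimensional posture embedding space.
W = rng.normal(scale=0.1, size=(51, 32))

def embed(skeleton):
    """Project a flattened 3D skeleton into the posture embedding space."""
    return skeleton @ W

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: pull together embeddings of postures
    aimed at the same gaze area (anchor/positive, possibly from different
    people) and push apart postures aimed at different areas (negative)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy skeletons: the positive is a slight perturbation of the anchor,
# the negative is an unrelated posture.
anchor = rng.normal(size=51)
positive = anchor + rng.normal(scale=0.01, size=51)
negative = rng.normal(size=51)

loss = triplet_loss(embed(anchor), embed(positive), embed(negative))
```

Minimizing such a loss over many (anchor, positive, negative) triplets is one common way to make an embedding invariant to person-to-person differences while remaining discriminative for the gaze target.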