3D Gesture Recognition by Superquadrics
Ilya Afanasyev and Mariolino De Cecco
Mechatronics Lab., University of Trento, via Mesiano, 77, Trento, Italy
Keywords: Superquadrics, Gesture Recognition, Microsoft Kinect, RANSAC Fitting, 3D Object Localization.
Abstract: This paper presents a 3D gesture recognition and localization method based on processing 3D data of hands in
color gloves acquired by a 3D depth sensor such as Microsoft Kinect. The RGB information of every 3D data point
is used to segment the 3D point cloud into 12 parts (a forearm, a palm and 10 finger parts). The object (a hand
with fingers) is assumed to be known a priori and anthropometrically modeled by SuperQuadrics (SQ) with certain
scaling and shape parameters. The gesture (pose) is estimated hierarchically by a RANSAC object search
with least-squares fitting of the segments of the 3D point cloud to the corresponding SQ models: first the pose of
the hand (forearm and palm), and then the positions of the fingers. The solution is verified by evaluating the
matching score, i.e. the number of inliers whose distances from the SQ surfaces satisfy an assigned distance
threshold.
1 INTRODUCTION
Gesture recognition, whose goal is to interpret
human gestures via mathematical algorithms, is an
important topic in computer vision with many
potential applications such as human-computer
interaction, sign language recognition, games, sport,
medicine, video surveillance, etc. Model-based
methods of hand gesture tracking have been studied
by many researchers (Rehg and Kanade, 1995;
Starner and Pentland, 1995; Heap and Hogg, 1996;
Zhou and Huang, 2003; La Gorce et al., 2008).
Some publications track hands wearing color gloves
using data acquired by fixed-position webcams
(Geebelen et al., 2010) or a single camera
(Wang and Popović, 2009). Hand tracking with
quadrics was used by Stenger et al. (2001), but their
model consisted of 39 quadrics, representing only the
palm and fingers.
The proposed method of 3D gesture recognition
by SQ is close to the hierarchical method of 3D
human body pose estimation by SQ (Afanasyev et
al., 2012), which was applied to 3D data captured by
a multi-camera system and segmented by a special
clothing-based preprocessing algorithm. In this
paper, the object of recognition is a hand gesture; the
sensor is an MS Kinect; and the 3D point cloud is
segmented by analyzing the Kinect RGB-depth data
of the color gloves. Since the hand and fingers can
be modeled a priori with anthropometric parameters
in a metric coordinate system, we propose a
hierarchical RANSAC-based model-fitting technique
with composite SQ models. SQs are known to
describe objects of complex geometry with few
parameters and to yield a simple minimization
function of the object pose (Jaklic et al., 2000;
Leonardis et al., 1997). The logic of the 3D gesture
recognition algorithm is clarified by the block
diagram (Fig. 1).
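To make the fitting step concrete, the sketch below (Python with NumPy) shows the standard superquadric inside-outside function and the radial Euclidean distance derived from it, following the formulation in (Jaklic et al., 2000). It assumes the points have already been transformed into the candidate SQ frame; the function and parameter names are illustrative and are not taken from the authors' implementation.

import numpy as np

def sq_inside_outside(points, a, eps):
    # Superquadric inside-outside function F(x, y, z) (Jaklic et al., 2000).
    # points : (N, 3) array of 3D points expressed in the SQ-centered frame.
    # a      : (a1, a2, a3) scaling parameters (half-axes, in metres).
    # eps    : (eps1, eps2) shape parameters.
    # F < 1 inside the surface, F = 1 on it, F > 1 outside.
    x, y, z = np.abs(points).T
    a1, a2, a3 = a
    e1, e2 = eps
    xy = (x / a1) ** (2.0 / e2) + (y / a2) ** (2.0 / e2)
    return xy ** (e2 / e1) + (z / a3) ** (2.0 / e1)

def sq_radial_distance(points, a, eps):
    # Radial Euclidean distance of each point from the SQ surface,
    # measured along the ray from the SQ centre to the point.
    F = sq_inside_outside(points, a, eps)
    r = np.linalg.norm(points, axis=1)
    return r * np.abs(1.0 - F ** (-eps[0] / 2.0))

A point is then counted as an inlier of a candidate pose when this distance is below the assigned distance threshold, which is how the matching score mentioned above can be evaluated.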
The gesture recognition starts with preprocessing
the 3D data points (captured by the MS Kinect),
segmenting them into 12 parts (forearm, palm and
10 finger parts) according to the glove colors. Then
the algorithm recovers the 3D pose of the hand as the
largest object (“Hand Pose Search”) and after that
restores the finger poses (“Fingers Pose Search”). To
cope with measurement noise and outliers, the pose
search uses a RANSAC SQ-fitting technique. The
fitting quality is controlled by inlier thresholds (for
the hand and the fingers), defined as the ratio of the
number of inliers of the best solution to the total
number of data points. Tests showed that the Hand
Pose Search can return a wrong palm position that
satisfies the palm threshold but hampers the Fingers
Pose Search. For this reason, when the inlier ratio of
a finger solution is less than the finger threshold, the
algorithm restarts the Hand Pose Search until
suitable results are found for every finger.
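The overall control flow of this hierarchical search with its restart logic can be summarised by the following sketch (Python). The helper ransac_fit_sq() is hypothetical: it stands for one RANSAC run that least-squares fits an SQ model to a colour segment and returns the candidate pose together with its inlier ratio. The threshold values are illustrative and are not the ones used in the experiments.

import numpy as np

def recognize_gesture(segments, sq_models, hand_thr=0.7, finger_thr=0.5,
                      max_restarts=50):
    # segments  : 12 point clouds keyed by glove colour (forearm, palm, 10 finger parts).
    # sq_models : a-priori anthropometric SQ parameters (scaling and shape) per segment.
    finger_keys = [k for k in segments if k not in ('forearm', 'palm')]
    hand_cloud = np.vstack([segments['forearm'], segments['palm']])

    for _ in range(max_restarts):
        # Hand Pose Search: fit the composite forearm + palm SQ model first.
        hand_pose, hand_ratio = ransac_fit_sq(
            hand_cloud, [sq_models['forearm'], sq_models['palm']])
        if hand_ratio < hand_thr:
            continue                        # poor hand fit: draw new RANSAC samples

        # Fingers Pose Search: fit each finger segment given the recovered hand pose.
        finger_poses = {}
        for key in finger_keys:
            pose, ratio = ransac_fit_sq(segments[key], sq_models[key],
                                        parent_pose=hand_pose)
            if ratio < finger_thr:
                break                       # the palm pose hampers the fingers:
            finger_poses[key] = pose        # restart the Hand Pose Search
        else:
            return hand_pose, finger_poses  # every finger satisfied its threshold

    return None                             # restart budget exhausted, no valid gesture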