Naturally, a description of a human body or palm
built from generalized cylinders is not very
accurate. The proposed approach therefore cannot be
used for high-precision reconstruction of the shape
and surface of 3D objects, as is done, for instance,
in (German Cheung Baker, 2003). Recognition of
gestures or poses, however, does not require a
highly accurate description of shapes and surfaces.
It is sufficient to recognise only the substantial
changes in the shape of these objects that
characterise gestures. This makes it possible to
solve the problem with simple and inexpensive
equipment under normal conditions.
2 THE PROPOSED METHOD
The proposed approach is based on revealing the
symmetry axes of locally symmetric objects.
Although these axes are not visible in the stereo
mate images, they can still be calculated for each
image by processing the silhouette presented in it.
We assume that the observed object has no
occlusions. This means that all elements of the
object, for example, the fingers of a palm, are
visible in the silhouette image. For objects with
occlusions we intend to use sequential segmentation
of the initial greyscale image to reveal the
overlapped parts.
Considering the silhouettes of stereo mate
images as projections of the spatial fat curves onto
the corresponding planes, we can expect that the
projections of the axes of the fat curves coincide
with the middle axes of the silhouettes.
In reality, the silhouette of a sphere is an ellipse.
For the simplified case, when the radius of the
sphere is constant, there is a precise method of
restoration based on the analysis of a single
silhouette image (Caglioti, 2006). For the images we
deal with, the difference between this ellipse and a
circle is so small that it can be neglected.
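How small the deviation from a circle is can be
estimated with a short calculation. The sketch below
assumes an ideal pinhole camera: the rays tangent to
the sphere form a circular cone, and the image plane
cuts this cone in a conic whose eccentricity follows
from the classical conic-section relation. The
function name and all numeric values are
illustrative assumptions and are not taken from the
paper.

```python
import math


def silhouette_axis_ratio(radius, distance, off_axis_deg):
    """Minor/major axis ratio of a sphere's silhouette (ideal pinhole camera).

    radius       -- sphere radius
    distance     -- distance from the camera centre to the sphere centre
    off_axis_deg -- angle between the optical axis and the ray to the centre
    All three parameters are hypothetical inputs chosen for illustration.
    """
    if distance <= radius:
        raise ValueError("camera centre must lie outside the sphere")
    # Half-angle of the cone of rays tangent to the sphere.
    alpha = math.asin(radius / distance)
    theta = math.radians(off_axis_deg)
    # The image plane is perpendicular to the optical axis, so it meets the
    # cone axis at an angle of (90 degrees - theta). For a plane cutting a
    # cone, eccentricity = cos(plane-to-axis angle) / cos(cone half-angle).
    eccentricity = math.sin(theta) / math.cos(alpha)
    if eccentricity >= 1.0:
        raise ValueError("silhouette is not an ellipse for this geometry")
    return math.sqrt(1.0 - eccentricity ** 2)


if __name__ == "__main__":
    # Example: a sphere of radius 1 at distance 50, seen 10 degrees off axis.
    print(silhouette_axis_ratio(1.0, 50.0, 10.0))
```

A ratio close to 1 means the silhouette is nearly
circular; the ratio decreases as the sphere moves
away from the optical axis.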
As the common matching points of the stereo mate
images we shall consider some (invisible) points
that are not boundary points of the silhouettes.
Such reference points are provided by the middle
axes of the silhouettes, which constitute their
skeletons.
Implementation of the proposed approach poses
several problems. First, we need to build the
skeletons of the silhouettes in a way that allows
the points of different skeletons to be identified
with each other. Then we have to restore the spatial
form of the whole object using the results of this
identification for the pair of skeletons. All
calculations should be performed within a computer
vision system in real time, which requires
processing several stereo mate images per second.
This demands highly efficient computational
algorithms.
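The restoration step can be illustrated with a
minimal sketch. It assumes a rectified stereo pair
with parallel optical axes, a focal length known in
pixels and a known baseline; the paper does not
state its camera model, so these are assumptions.
The names SkeletonPoint and reconstruct_sphere are
hypothetical. Each skeleton point carries the radius
of the circle inscribed in the silhouette, and a
matched pair of such points yields one sphere of the
spatial object.

```python
from dataclasses import dataclass


@dataclass
class SkeletonPoint:
    """A skeleton point of one silhouette (hypothetical structure)."""
    x: float  # column in pixels, measured from the principal point
    y: float  # row in pixels, measured from the principal point
    r: float  # radius of the inscribed circle, in pixels


def reconstruct_sphere(left, right, focal_px, baseline):
    """Return (X, Y, Z, R) for a matched pair of skeleton points.

    Assumes rectified cameras: matched points share a row and differ only
    in their column, so depth follows from the disparity.
    """
    disparity = left.x - right.x
    if disparity <= 0:
        raise ValueError("non-positive disparity: not a valid match")
    z = focal_px * baseline / disparity      # depth from disparity
    x = left.x * z / focal_px                # back-project the centre
    y = left.y * z / focal_px
    # Approximate sphere radius: mean image radius scaled by depth.
    radius = 0.5 * (left.r + right.r) * z / focal_px
    return (x, y, z, radius)


if __name__ == "__main__":
    left = SkeletonPoint(x=40.0, y=10.0, r=8.0)
    right = SkeletonPoint(x=20.0, y=10.0, r=8.0)
    print(reconstruct_sphere(left, right, focal_px=500.0, baseline=0.1))
```

The per-point cost is a handful of arithmetic
operations, which is in line with the real-time
requirement; the expensive parts are building and
matching the skeletons.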
The notion of a flat flexible object is introduced
in (Mestetskiy, 2007), where an effective method of
comparing flexible objects on the basis of a
boundary-skeletal model is also proposed. In the
present paper, we propose a generalisation of the
notion of a flat flexible object to the spatial
case.
We define a spatial flexible object as a set of
spheres of various sizes with centres on a spatial
tree. Processing the stereo mate images allows the
spatial structure of the object to be reconstructed.
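One possible in-memory representation of this
definition is sketched below. It samples the spatial
tree at a finite number of nodes, each holding one
sphere, which is a simplifying assumption; the names
SphereNode, walk and contains are hypothetical and
do not come from the paper.

```python
from dataclasses import dataclass, field
from math import dist
from typing import Iterator, List, Tuple

Point3 = Tuple[float, float, float]


@dataclass
class SphereNode:
    """One sphere of the object; child nodes follow the tree branches."""
    centre: Point3
    radius: float
    children: List["SphereNode"] = field(default_factory=list)

    def add_child(self, centre: Point3, radius: float) -> "SphereNode":
        """Attach a new sphere to this node and return it."""
        child = SphereNode(centre, radius)
        self.children.append(child)
        return child

    def walk(self) -> Iterator["SphereNode"]:
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def contains(root: SphereNode, point: Point3) -> bool:
    """True if the point lies inside at least one sphere of the object."""
    return any(dist(node.centre, point) <= node.radius for node in root.walk())


if __name__ == "__main__":
    palm = SphereNode((0.0, 0.0, 0.0), 2.0)
    palm.add_child((3.0, 0.0, 0.0), 1.0)      # first sphere of a finger
    print(contains(palm, (3.5, 0.0, 0.0)))
```

With such a structure, a change of gesture shows up
as a change in the centres and radii stored in the
nodes, while the tree topology stays the same.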
Reconstructing the spatial characteristics of the
object allows monitoring the displacement dynamics
of its constituent elements, as well as the changes
in the object's shape. Applied to the human palm or
body, this allows tracking their gestures or
movements.
Implementation of the proposed approach involves
solving several subtasks.
2.1 Silhouette Acquisition
It is assumed that a pair of video cameras provides
synchronized images of an object. An example of
such stereo mate images is presented in fig. 1. In
our experiments, standard web cameras connected to
a desktop computer were used. Each image is
segmented separately, then a silhouette is extracted
and represented as a binary raster image. There are
different ways of segmentation, and the choice
depends on the specific application. Note that in
gesture recognition the requirements on the quality
of silhouette images are not very demanding.
Figure 2 shows the result of palm segmentation
obtained using a simple method of background
subtraction and thresholding.
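A minimal sketch of background subtraction with
thresholding is given below. It assumes greyscale
frames stored as lists of rows of integer
intensities and a static background image captured
beforehand; the function name and the threshold
value are illustrative assumptions, not the settings
used in the experiments.

```python
def extract_silhouette(frame, background, threshold=30):
    """Return a binary raster: 1 where the frame differs from the background.

    frame, background -- greyscale images of equal size, given as lists of
                         rows of integer intensities (0..255)
    threshold         -- minimum absolute difference that counts as object
                         (hypothetical default)
    """
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same height")
    silhouette = []
    for frame_row, bg_row in zip(frame, background):
        if len(frame_row) != len(bg_row):
            raise ValueError("frame and background must have the same width")
        # A pixel belongs to the object if it differs enough from the
        # background at the same position.
        silhouette.append(
            [1 if abs(p - b) > threshold else 0
             for p, b in zip(frame_row, bg_row)]
        )
    return silhouette


if __name__ == "__main__":
    background = [[10, 10, 10], [10, 10, 10]]
    frame = [[12, 200, 10], [10, 180, 9]]
    for row in extract_silhouette(frame, background):
        print(row)
```

Small intensity fluctuations stay below the
threshold and are suppressed, so only pixels that
clearly differ from the background enter the
silhouette.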
Figure 1: The palm stereo mate.
2.2 Continuous Skeleton
Construction of the silhouette skeletons (fig. 3) is
VISAPP 2009 - International Conference on Computer Vision Theory and Applications