Figure 3: Different stages of the segmentation algorithm: (a) target image, (b) background image, (c) division of gesture region, (d) gesture region.
obtain these motion features of a dynamic hand gesture video. Tracking is the process of estimating the trajectory of the hand as it moves through the area of interest. The tracking algorithm has to estimate the state of the system at any time instant, given a set of observations.
2.3.1 Spatial Features using MPEG-7 Shape
Descriptors (Azhar and Amer, 2008)
For spatial information, MPEG-7 ART shape descriptors are used. The visual part of the MPEG-7 standard defines three shape descriptors with different properties: region-based, contour-based, and 3D spectrum shape descriptors.
Region-based shape descriptors are used here. These descriptors take into account all pixels constituting the shape, both boundary and interior pixels. Conceptually, the descriptor works by decomposing the shape into a number of orthogonal 2-D basis functions, defined by the Angular Radial Transform (ART). The MPEG-7 ART descriptor employs a complex Angular Radial Transform defined on a unit disk in polar coordinates.
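As an illustration of the decomposition described above, the following is a minimal sketch, not the implementation used in this work. It assumes the commonly stated ART basis V_nm(rho, theta) = (1/2*pi) exp(j*m*theta) R_n(rho), with R_0 = 1 and R_n = 2 cos(pi*n*rho) for n > 0, and the usual MPEG-7 choice of 3 radial and 12 angular orders. The function name `art_descriptor`, the use of the image centre as the disk centre, and the normalization by the n = 0, m = 0 coefficient are assumptions made for the example.

```python
import numpy as np

def art_descriptor(mask, n_radial=3, n_angular=12):
    """Return normalized ART coefficient magnitudes of a binary shape mask.

    Hypothetical helper: the disk is centred on the image centre (an
    assumption; a centroid-based centre is an equally plausible choice).
    """
    mask = np.asarray(mask, dtype=float)
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = max(cy, cx) or 1.0          # avoid division by zero for 1x1 input
    x = (xs - cx) / radius
    y = (ys - cy) / radius
    rho = np.hypot(x, y)                 # radial coordinate
    theta = np.arctan2(y, x)             # angular coordinate
    inside = rho <= 1.0                  # keep only pixels on the unit disk

    coeffs = np.zeros((n_radial, n_angular))
    for n in range(n_radial):
        radial = np.ones_like(rho) if n == 0 else 2.0 * np.cos(np.pi * n * rho)
        for m in range(n_angular):
            basis = radial * np.exp(1j * m * theta) / (2.0 * np.pi)
            # Summing over pixels approximates the integral over the disk
            # (the Cartesian area element absorbs the rho*d_rho*d_theta factor).
            coeffs[n, m] = np.abs(np.sum(mask[inside] * np.conj(basis[inside])))

    f00 = coeffs[0, 0]
    if f00 == 0.0:                       # empty shape: nothing to normalize
        return np.zeros(n_radial * n_angular - 1)
    # Drop the n=0, m=0 term, which is used only for normalization.
    return (coeffs / f00).ravel()[1:]
```

Because only coefficient magnitudes are kept, rotating the shape about the disk centre changes the phase of each coefficient but not the resulting feature vector, which is why this kind of descriptor is convenient for hand shapes seen at varying orientations.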
2.3.2 Tracking of Hand by Particle Filters
(Arulampalam et al., 2002)
Tracking objects efficiently and robustly in complex environments is a challenging issue in computer vision. The hand regions are often tracked across the image frames to generate suitable features. Particle-filter-based tracking and its applications in gesture recognition systems have become popular recently. The key idea is to represent probability densities by a set of samples. As a result, a particle filter can represent a wide range of probability densities, allowing real-time estimation of nonlinear, non-Gaussian dynamic systems.
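The sample-based idea can be sketched with a basic bootstrap (sampling importance resampling) particle filter. This is an illustrative sketch under stated assumptions, not the tracker used in this work: the state is taken to be the 2-D hand position, the motion model is a Gaussian random walk, the observation is a measured hand centroid with Gaussian noise, and the function name `particle_filter_track` and all parameter values are hypothetical.

```python
import numpy as np

def particle_filter_track(observations, n_particles=500,
                          motion_std=5.0, obs_std=10.0, seed=0):
    """Estimate a 2-D trajectory from noisy centroid observations.

    Assumed models (not from the paper): random-walk motion and
    isotropic Gaussian observation noise.
    """
    rng = np.random.default_rng(seed)
    observations = np.asarray(observations, dtype=float)

    # Initialize the sample set around the first observation.
    particles = observations[0] + rng.normal(0.0, obs_std, size=(n_particles, 2))
    trajectory = []

    for z in observations:
        # Predict: propagate every particle through the motion model.
        particles = particles + rng.normal(0.0, motion_std, size=particles.shape)

        # Update: weight each particle by the likelihood of the observation.
        sq_dist = np.sum((particles - z) ** 2, axis=1)
        log_w = -0.5 * sq_dist / obs_std ** 2
        weights = np.exp(log_w - log_w.max())   # subtract max for stability
        weights /= weights.sum()

        # Estimate: the weighted mean approximates the posterior mean.
        trajectory.append(weights @ particles)

        # Resample: draw particles in proportion to their weights.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]

    return np.array(trajectory)
```

The returned array holds one estimated position per frame; trajectory features could then be derived from these positions, for example at selected key points.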
2.4 Classification and Recognition
Spatial feature vectors and features obtained from the gesture trajectory are given as input to the neural network classifier to classify the different dynamic gestures. A radial basis function (RBF) neural network (Michie et al., 1974) is used for classification. A total of 20 dynamic hand gestures have been chosen for our experiments. Among them, 8 are ASL gestures and 12 are control commands.
A radial basis function network is an artificial neural network that uses radial basis functions as activation functions. RBF networks typically have three layers: an input layer, a hidden layer with RBF activation functions, and an output layer. The hidden units provide a set of functions that constitute an arbitrary basis for the input patterns (vectors) when they are expanded into the hidden-unit space; these functions are called radial basis functions. Different types of radial basis functions could be used, but the most common is the Gaussian function.
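The three-layer structure described above can be sketched as follows. This is a minimal illustration, not the network configuration used in this work: the class name `RBFNetwork`, the choice of a single shared Gaussian width, the use of caller-supplied centres, and the least-squares fit of the output weights are all assumptions made for the example.

```python
import numpy as np

class RBFNetwork:
    """Gaussian RBF hidden layer followed by a linear output layer."""

    def __init__(self, centers, width):
        self.centers = np.asarray(centers, dtype=float)  # one row per hidden unit
        self.width = float(width)                        # shared Gaussian width
        self.weights = None

    def _hidden(self, X):
        X = np.asarray(X, dtype=float)
        # Squared distance from every input to every centre.
        sq = np.sum((X[:, None, :] - self.centers[None, :, :]) ** 2, axis=2)
        H = np.exp(-sq / (2.0 * self.width ** 2))        # Gaussian activations
        return np.hstack([H, np.ones((H.shape[0], 1))])  # append a bias column

    def fit(self, X, labels, n_classes):
        H = self._hidden(X)
        targets = np.eye(n_classes)[np.asarray(labels)]  # one-hot class targets
        # Output weights solved by linear least squares (an assumed training rule).
        self.weights, *_ = np.linalg.lstsq(H, targets, rcond=None)
        return self

    def predict(self, X):
        # The class with the largest output activation wins.
        return np.argmax(self._hidden(X) @ self.weights, axis=1)


# Usage on a toy problem that is not linearly separable (XOR).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])
net = RBFNetwork(centers=X, width=0.5).fit(X, y, n_classes=2)
predicted = net.predict(X)
```

In a gesture recognizer of the kind described here, each row of `X` would be a combined spatial and trajectory feature vector and `n_classes` would be the number of gestures; the centres could, for instance, be chosen by clustering the training vectors.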
3 SIMULATION RESULTS
In this paper we discussed a dynamic hand gesture recognition system based on a radial basis function neural network. A total of twenty dynamic hand gestures were chosen for our experiments: 8 ASL gestures and 12 control commands, as shown in Figure 2. The gestures were captured using a Panasonic handycam with 3.1-megapixel picture quality. The complete system works at a frame rate of about 25 frames/s. The hand region is extracted using a background subtraction algorithm. One major advantage of this method is that it extracts the gesture region without requiring future frames. Spatial features are extracted using MPEG-7 shape descriptors. For temporal information, the hand is tracked using a particle filter. Trajectory features are extracted only for key points. Spatial and trajectory features are combined and given as inputs to the classifier, a radial basis function neural network. Eight ASL gestures are used for testing. Table 1 presents the results for the different dynamic gestures using the RBF network. Table 2 presents the recognition results for the 8 ASL gestures using the RBF network. Recognition rates ranged from 80% to 98%.