The Hand-gesture-based Control Interface with Wearable Glove System
Vladislav Berezhnoy, Dmitry Popov, Ilya Afanasyev and Nikolaos Mavridis
Institute of Robotics, Innopolis University, Universitetskaya Str. 1, Innopolis, 420500, Russia
Keywords:
Gesture-based Control Interface, Glove System, Fuzzy C-means (FCM) Clustering, Arduino, V-REP Simulator.
Abstract:
The paper presents an approach to building a gesture-based control interface with a wearable glove system
and a real-time gesture recognition algorithm. The glove-based system is a wireless wearable device with hardware components, including an Arduino Nano controller, an IMU and flex sensors, and software for gesture recognition. Our gesture recognition methodology consists of two stages: 1) building a library of dynamic gesture models with reference human gesture graphs; 2) capturing and evaluating gestures with fuzzy c-means (FCM) clustering and constructing gesture grammars from fuzzy membership functions. The system was tested with 6 different dynamic gestures used to control the position and orientation of a quadcopter in the V-Rep simulator, which demonstrated encouraging results with a reasonable quality of real-time gesture-based quadcopter control.
1 INTRODUCTION
The gesture-based control interface has a great po-
tential for more natural, intuitively understandable,
customizable and convenient human-machine inter-
action, which can extend capabilities of widespread
graphical and command line interfaces, which we
use nowadays with mouse and keyboard. There-
fore, development of advanced hardware and soft-
ware approaches to hand-gesture recognition is im-
portant for many 3D applications such as control of
computers and robots, interaction with the computer-
generated environment (virtual or augmented real-
ity), sign language understanding, gesture visualiza-
tion, games control, enhancement of communication
ability for disabled people, etc. Many recent studies
discuss various methods to solve gesture recognition
problem of certain gesture classes by computer vision
based systems (Suryanarayan et al., 2010; Suarez and
Murphy, 2012; Rautaray and Agrawal, 2015; Wachs
et al., 2011; Zabulis et al., 2009), wearable sensor-
based systems (Luzhnica et al., 2016; Park et al.,
2015) or even integrated systems, which simultane-
ous use both vision-based devices and wearable sen-
sors (Arkenbout et al., 2015; Mavridis et al., 2012).
The classification of gesture classes depends on ges-
ture difficulties, applications and level of recognition
accuracy (e.g. surgical systems require higher accu-
racy than entertainment or communication applica-
tions) (Suarez and Murphy, 2012; Wachs et al., 2011).
Computer vision-based techniques are one of the most
frequently used approaches, which apply RGB cam-
era and image processing algorithms (Manresa et al.,
2005; Alfimtsev, 2008), Kinect sensors and depth
maps (Suryanarayan et al., 2010; Suarez and Mur-
phy, 2012; Ren et al., 2013; Dominio et al., 2014;
Afanasyev and De Cecco, 2013) and Time-of-Flight
(ToF) cameras (Gudhmundsson et al., 2010) for ges-
ture tracking and hand motion detection. These solutions can be computationally expensive and may suffer from a lack of robustness against cluttered backgrounds or poor motion scenarios (e.g., when only a single gesture is used). Moreover, they often demon-
strate sensitivity to the environment, illumination con-
ditions, scene, background details and camera param-
eters (resolution, frame rate, distortion, auto-shutter
speed, etc.) that can affect recognition quality (Luzh-
nica et al., 2016; Wachs et al., 2011). Therefore, many
investigations focus also on wearable gesture recog-
nition systems with the ability to track dynamic ges-
tures for the complicated work environment in real-
time with reasonable computational cost and higher
accuracies (Kumar et al., 2012; Luzhnica et al., 2016;
Kenn et al., 2007; Battaglia et al., 2016). The overall
goal of hand gesture recognition is to find similarity
between an unknown performed gesture (called rec-
ognizable model) and a known class of gestures (pat-
terns of gestures). Once the suitable hand gesture fea-
tures have been extracted and a gesture set has been
selected, gesture classification can be accomplished
by standard machine learning techniques or special-
purpose classifiers (Suarez and Murphy, 2012), which
are frequently based on Neural Networks (Gawande
and Chopde, 2013), Bayesian networks (Suk et al.,
2010), Hidden Markov Models (Bansal et al., 2011),
etc. The main drawback of these methods is high
computational complexity for forming gesture pat-
terns and recognizing dynamic gestures that may limit
their feasibility for real-time applications. Another
approach with an attractive algorithm of dynamic ges-
ture recognition based on fuzzy finite state automata
(for human’s wrist detection from video stream) was
proposed in (Devyatkov and Alfimtsev, 2007; Alfimt-
sev, 2008), and inspired the authors of this paper to
contribute and update this methodology for gesture
recognition with inertial IMU and flex sensors built
in a wearable glove.
In this paper, we focus on the development of a
wearable gesture recognition system, which consists
of sensor-integrated glove hardware and hand gesture
recognition software for human control of computer-
based objects and machines (see Fig. 1). Our gesture recognition approach comes down to (1) tracking the trajectories of hand movements along the coordinate axes x(t), y(t) and z(t); (2) building a recognizable model of the gesture G[x(t), y(t), z(t)] using the tracked trajectories; (3) comparing the recognizable gesture model with the reference gesture patterns E_i[x(t), y(t), z(t)] by computing the similarity function C[G, E_i] to determine the relation to the i-th gesture class. This wearable glove system and the hand-gesture-based software were used to create a control interface for manipulating a quadcopter model in the V-Rep simulator, demonstrating successful real-time gesture-based control of the quadcopter position and orientation with 6 different dynamic gestures.
The paper is organized as follows: Section 2
presents our wearable glove system for gesture recog-
nition, Section 3 formalizes the dynamic pattern con-
struction methodology, and Section 4 describes how
we use the FCM clustering algorithm to recognize a
gesture pattern. Finally, we test our approach to build-
ing a gesture-based control interface with the glove
system and a real-time gesture recognition algorithm
in Section 5 and conclude in Section 6.
2 DEVELOPMENT OF A
WEARABLE GLOVE SYSTEM
FOR GESTURE RECOGNITION
Numerous wearable glove-based systems already ex-
ist (Dipietro et al., 2008). In our work, we have taken
into account existing techniques but utilized a novel
recognition approach.
In our system, to recognize an unknown gesture
the following information is required:
1) Pitch, roll and yaw hand rotations relative to the
surface;
2) Acceleration projections on each coordinate axis;
3) The numerical value of the bending for each finger.
Objectives 1 and 2 are covered by the integrated MinIMU-9 v2 sensor, which consists of an accelerometer, magnetometer and gyroscope and measures the projections needed to calculate pitch, roll and yaw. Objective 3 is covered by the flex sensors.
We organized the gesture-based control interface around the wearable glove system: the sensors are connected to the Arduino Nano controller, data are transmitted wirelessly over Bluetooth (BT), and a Java application processes them (see the functional diagram in Fig. 1). The software evaluates gestures with the fuzzy c-means (FCM) clustering algorithm (Bezdek et al., 1984) and computes a similarity function between recognizable gesture models and the dynamic reference gesture patterns from the gesture library. The glove-based system was tested by controlling an Unmanned Aerial Vehicle (UAV) in the V-Rep simulator.
The hardware components of the wearable glove sys-
tem are shown in Fig. 2.
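As an illustration of the data produced by the glove, a minimal host-side sketch is given below. It assumes a hypothetical comma-separated frame (pitch, roll, yaw, three acceleration projections, five finger-bend values) arriving over the Bluetooth serial link; the wire format of the actual Java application may differ.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GloveFrame:
    """One measurement frame from the wearable glove (assumed layout)."""
    pitch: float          # hand rotation angles relative to the surface
    roll: float
    yaw: float
    accel: List[float]    # acceleration projections on the x, y, z axes
    flex: List[float]     # bending value of each of the five fingers

def parse_frame(line: str) -> GloveFrame:
    """Parse a hypothetical CSV frame: pitch,roll,yaw,ax,ay,az,f1,...,f5."""
    v = [float(s) for s in line.strip().split(",")]
    return GloveFrame(pitch=v[0], roll=v[1], yaw=v[2],
                      accel=v[3:6], flex=v[6:11])
```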
3 CONSTRUCTION OF DYNAMIC
GESTURE PATTERNS
The method that constructs dynamic gesture patterns performed by hand motion consists of two stages: 1) capturing and tracking regions of interest O_b[x(t), y(t)], containing the x(t) and y(t) trajectory projections of the hand motion on the coordinate axes over time; 2) building a pattern E_i[x(t), y(t)] and a recognizable gesture model G[x(t), y(t)] using the tracked trajectories. When performing a gesture for recognition, it is enough to execute it once, but for the construction of the reference gesture models (classes) the same gesture needs to be performed repeatedly. The motion trajectories of the repeated gesture are, of course, not identical. For example, hand motion trajectories (x(t), y(t)) drawing the letter "N" form, over an n-fold repetition of the gesture j = 1, 2, ..., n, a set of trajectories (x_j(t), y_j(t)) contained within certain boundaries, which constitutes a gesture pattern E with its own characteristic features, as shown in Fig. 3.
Stage 2 of our approach is related to the
methodology of dynamic gesture recognition based
on fuzzy finite state automata proposed by (Devy-
atkov and Alfimtsev, 2007). Since this methodol-
Figure 1: The functional diagram of wearable glove-system for gesture-based control of UAV.
Figure 2: The hardware components of the glove system.
ogy was published in Russian, let us describe its main principles using the recognition of the gesture drawing the letter "N" as an example. The generalization of gesture patterns E for different hand trajectories in the form of the letter "N" can be represented as the graph shown in Fig. 3.
Figure 3: The wearable glove system with the dynamic gesture graph in the form of the letter "N".
According to this graph, vertex I includes the set of points (coordinates) belonging to the beginning of the gesture, vertices II and III correspond to the points of trajectory bending, and vertex IV contains the end points of the trajectory. This graph serves as a basis for gesture pattern construction.
The first task of gesture pattern construction E_i according to the trajectories (x_j(t), y_j(t)) is an algorithmic definition of the movement trajectory points (x_j(t_m), y_j(t_m)) at time t_m, which correspond to the
m-th vertex of the graph. To solve this task, we used
fuzzy c-means (FCM) clustering algorithm (Bezdek
et al., 1984; Nayak et al., 2015).
Let us recall the main definitions for c-means clustering. The set of points of the motion trajectory relative to one vertex is called a cluster. The number of measurement points is denoted by N. Each c-th cluster includes a subset of the values of the characteristic feature vectors p_k = [p_k1, p_k2, ..., p_km], where k = 1, ..., N, N is the total number of points and m is the number of features. For the considered gesture, m = 4. The features of each k-th point are based on the trajectories x(t) and y(t) at the corresponding time t_k:

$$p_{k1} = x(t_k), \quad p_{k2} = \left.\frac{dx(t)}{dt}\right|_{t=t_k}, \quad p_{k3} = y(t_k), \quad p_{k4} = \left.\frac{dy(t)}{dt}\right|_{t=t_k} \qquad (1)$$
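For illustration, the following sketch builds the feature vectors p_k of Eq. (1) from sampled trajectory projections, approximating the time derivatives by finite differences; the function name and the use of NumPy are ours, not part of the original implementation.

```python
import numpy as np

def feature_vectors(x, y, t):
    """Build p_k = [x(t_k), dx/dt|t_k, y(t_k), dy/dt|t_k] for each sample k (Eq. 1).

    x, y, t are 1-D sequences of equal length N; the derivatives are
    estimated by finite differences with np.gradient.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    dx = np.gradient(x, t)            # approximation of dx/dt at each t_k
    dy = np.gradient(y, t)            # approximation of dy/dt at each t_k
    return np.column_stack([x, dx, y, dy])   # array of shape (N, 4)
```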
The c-means clustering algorithm is based on minimizing an objective function, which should be constructed so as to: (1) minimize the distance between a cluster's points and the cluster center; (2) maximize the distance between cluster centers.
Although the FCM clustering algorithm has long been applied to gesture classification (Li, 2003; Wachs et al., 2002), we used the criterion known as the within-class sum of squared errors, which uses the Euclidean norm (3) to describe the distance d_ik between vectors, as presented in (Devyatkov and Alfimtsev, 2007). This criterion is denoted J(u, v), where u is a partition of all points into clusters and v is the vector of cluster centers corresponding to the partition u. The formula of the criterion (the objective function) is the following:

$$J(u, v) = \sum_{k=1}^{N} \sum_{i=1}^{c} u_{ik}\, d_{ik}^{2} \qquad (2)$$

where d_ik is the distance in the m-dimensional Euclidean feature space between the k-th m-dimensional vector p_k and the i-th cluster center v_i, which is calculated by the formula:

$$d_{ik} = \| p_k - v_i \| = \left[ \sum_{j=1}^{m} (p_{kj} - v_{ij})^2 \right]^{1/2} \qquad (3)$$
The coordinates of the cluster centers v_i = {v_i1, v_i2, ..., v_im} are calculated by the formula:

$$v_{ij} = \frac{\sum_{k=1}^{N} u_{ik}\, p_{kj}}{\sum_{k=1}^{N} u_{ik}} \qquad (4)$$

where u_ik is the characteristic function of the i-th cluster A_i, i = 1, 2, ..., C:

$$u_{ik} = \begin{cases} 1, & \text{if } p_k \in A_i, \\ 0, & \text{if } p_k \notin A_i. \end{cases} \qquad (5)$$
Figure 4: Graph of the gesture in the form of letter "N".
It is required to find the optimal partition u* into clusters with centers v*, for which the objective function value is minimal:

$$J(u^{*}, v^{*}) = \min_{u \in M_c,\ v} J(u, v) \qquad (6)$$

where M_c is the set of all different partitions into C clusters.
We use the strategy of the c-means clustering algorithm known as iterative optimization, which includes the following steps (Devyatkov and Alfimtsev, 2007):
1. Fix the number of clusters C (2 < C < N) and select a primary partition of the set of trajectory points into clusters A_i. Then perform the following steps for r = 0, 1, 2, ...
2. Compute the centers v_i^(r) of all clusters defined by the partition u^(r).
3. Calculate the new membership values for all i and k:

$$u_{ik}^{(r+1)} = \begin{cases} 1, & \text{if } d_{ik}^{(r)} = \min\limits_{i=1,\dots,C} d_{ik}^{(r)}, \\ 0, & \text{otherwise.} \end{cases} \qquad (7)$$

4. Build the new partition u^(r+1).
5. If u^(r+1) = u^(r), stop the process and consider the partition u^(r+1) optimal. Otherwise, set r = r + 1 and go to step 2.
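A minimal sketch of this iterative optimization (hard assignments as in Eqs. (5) and (7), centers as in Eq. (4)) is shown below; it is only an illustration of the procedure with a simple random initial partition, not the authors' Java implementation.

```python
import numpy as np

def c_means(points, c, max_iter=100, seed=0):
    """Iterative c-means clustering of feature vectors (Eqs. 2-7).

    points : array of shape (N, m); c : number of clusters (2 < c < N).
    Returns (labels, centers); labels[k] is the cluster index of p_k.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(points)
    labels = rng.integers(0, c, size=n)        # primary (random) partition
    for _ in range(max_iter):
        # Step 2: centers of all clusters of the current partition (Eq. 4)
        centers = np.array([points[labels == i].mean(axis=0)
                            if np.any(labels == i) else points[rng.integers(n)]
                            for i in range(c)])
        # Step 3: reassign each point to its nearest center (Eqs. 3 and 7)
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)
        # Step 5: stop when the partition no longer changes
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centers
```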
This strategy (Devyatkov and Alfimtsev, 2007) allows determining the models of all reference gestures, which can be represented by graphs in which the points of the trajectories (x_j(t), y_j(t)) are assigned to the vertices of the graph of a dynamic gesture. This yields gesture patterns in which graph vertices correspond to clusters with their centers and edges correspond to the trajectory direction. An example of a gesture graph in the shape of the letter "N" is shown in Fig. 4, where the vertices correspond to clusters A_1, A_2, A_3 and A_4. The coordinates of the cluster centers are shown on the axes.
Figure 5: Gesture graph projections of the graph shown in Fig. 4.
4 GESTURE RECOGNITION
WITH FUZZY FINITE
AUTOMATA AND GRAMMARS
The gesture graph in Fig. 4 does not contain information about the movement of the cluster centers over time. To consider the gesture over time, a model of dynamic gestures based on fuzzy finite automata should be built (Devyatkov and Alfimtsev, 2007). To do this, according to the gesture graph shown in Fig. 4, the two graphs shown in Fig. 5 can be constructed. These graphs are obtained by projecting the hand trajectories onto the time axis and the abscissa axis, and onto the time axis and the ordinate axis.
The basic principles of automaton grammar generation for gesture recognition are well presented in (Alfimtsev, 2008). Since that work was published in Russian, let us describe the main statements. When considering one sample of a gesture projection y_i(t), the sequence of n + 1 samples Y_i[t_0, t_n] = {y_i(t_0), y_i(t_1), y_i(t_2), ..., y_i(t_n)} of the i-th projection of the same gesture graph for several consecutive time points t_0, t_1, ..., t_n (the time interval [t_0, t_n]) is called a signal. The set of samples K(t) = {y_1(t), y_2(t), ..., y_m(t)}, where m is the number of different projections of the same gesture graph at time t, is called a reaction. The sequence of reactions K(t_0), K(t_1), ..., K(t_n), obtained from the m projections of the same gesture for several consecutive moments of time t_0, t_1, t_2, ..., t_n (a time interval [t_0, t_n]), is called a flow of reactions.
Each sample y_j(t_i) of the same signal corresponds to the state b_j(t_i) of the finite automaton M_j. An output function ϕ(b_j(t_i)) = y_j(t_i) and a transition function f(b_j(t_i), t_{i+1}) = b_j(t_{i+1}) can then be introduced for the finite automaton M_j. Thus, each sample is a value of the output function y_j(t) = ϕ(b_j(t)) of the automaton M_j; each signal is a sequence of output-function values y_j(t) = {y_j(t_0), y_j(t_1), ..., y_j(t_n)} of the same automaton M_j; each reaction is the set y(t) = {y_1(t), ..., y_m(t)} of output-function values of the different automata M_1, M_2, ..., M_m; and the flow of reactions is the sequence y(t_0), y(t_1), ..., y(t_n). Therefore, any automaton M_j corresponding to a projection of a gesture graph can be represented by its transition graph, where each vertex is marked with a symbol b_i and, for each pair of adjacent vertices (b_i, b_{i+1}), the edge directed from vertex i to vertex i + 1 is marked with a symbol t_i from the alphabet T = {t_0, t_1, t_2, ..., t_{m-1}}. Writing down all the edges yields a sequence of symbols t_1 t_2 ... t_{m-1} Λ (where Λ is an empty symbol, which may be omitted). This sequence can be considered a word or a sentence of the language L = L(G) generated by the automaton grammar G = {V, T, P, S = b_0}, where V = {b_1, b_2, ..., b_{m-1}}, T = {t_1, t_2, ..., t_{m-1}, Λ}, and P = {b_0 → t_1 b_1, b_1 → t_2 b_2, ..., b_{m-2} → t_{m-1} b_{m-1}, b_{m-1} → t_m b_m, b_m → Λ}.
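For illustration, such a chain-structured automaton grammar can be stored as a mapping from a state to the pair (input symbol, next state); the sketch below reconstructs the generated word for a hypothetical four-vertex gesture graph (state and symbol names are ours, not from the described system).

```python
def generate_word(productions, start="b0"):
    """Follow chain rules b_i -> t_{i+1} b_{i+1} and collect the terminal symbols.

    productions maps a state to (terminal, next_state); a missing entry ends
    the word (the empty symbol Lambda is omitted).
    """
    word, state = [], start
    while state in productions:
        terminal, state = productions[state]
        word.append(terminal)
    return word

# Example chain grammar for a hypothetical four-vertex gesture graph
P = {"b0": ("t1", "b1"), "b1": ("t2", "b2"), "b2": ("t3", "b3")}
print(generate_word(P))   # ['t1', 't2', 't3']
```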
In the ideal case, a set of automata M_1, M_2, ..., M_m can be constructed for gesture recognition in such a way that each automaton corresponds to one of the distinct grammars. The language corresponding to such an automaton could then be unambiguously detected by its automaton grammar. However, in reality such an ideal situation is unattainable, because a person cannot perform each new repetition of a gesture in exactly the same way. Therefore, to cope with the uncertainty that arises when performing gestures, we need to move from deterministic to fuzzy automata. To do this, we construct for each distinct grammar G the corresponding fuzzy grammar G_F, based on the principles described in (Alfimtsev, 2008).
Each edge of the finite deterministic automaton corresponds to two incident vertices b_i and b_{i+1} with the vertex coordinates [t_i, ϕ(b_i(t_i)) = y_i(t_i)] and [t_{i+1}, ϕ(b_{i+1}(t_{i+1})) = y_{i+1}(t_{i+1})], respectively. Following (Alfimtsev, 2008), we assume that the samples y_i(t_i) of the same cluster, corresponding to l different trajectories of the same gesture, may vary within the standard deviation of the projection from the cluster center v_i(t_i):

$$s_i = \sqrt{\frac{\sum_{l=1}^{N} \left( y_i^{l}(t_i) - v_i(t_i) \right)^2}{N}}, \qquad (8)$$
where N is the number of samples belonging to the cluster, v_i is the coordinate of the center of the i-th cluster, and y_i^l(t_i) is a sample belonging to the i-th cluster. For simplicity, assume that s_i is the same for all i and equal to s. For each set of samples y_i^l(t_i) we define the triangular membership function µ_i(y), determined by the points y_i^- = v_i − s, y_i = v_i, y_i^+ = v_i + s, where µ_i(y_i^-) = 0, µ_i(y_i) = 1 and µ_i(y_i^+) = 0.
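A short sketch of Eq. (8) and of the resulting support points of the triangular membership function is given below (illustrative helper names, not taken from the original code).

```python
import numpy as np

def cluster_std(samples, center):
    """Standard deviation s_i of the samples y_i^l(t_i) around the center v_i(t_i) (Eq. 8)."""
    samples = np.asarray(samples, dtype=float)
    return float(np.sqrt(np.mean((samples - center) ** 2)))

def triangle_points(center, s):
    """Support points (y^-, y, y^+) of the triangular membership function mu_i."""
    return center - s, center, center + s
```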
The vertex b_i with coordinates (t_i, y_i) is replaced by a set of vertices b_ri ∈ B(b_i) with coordinates changing within the interval (y_i^- = y_i − s, y_i^+ = y_i + s). Each vertex b_ri corresponds to a specific coordinate, and the set B(b_i) is formed from the vertices of all the coordinates for which µ_i(y_i^l(t_i)) > 0. Then, instead of the vertex b_{i+1} with coordinates (t_{i+1}, y_{i+1}), we have a set of vertices b_r(i+1) ∈ B(b_{i+1}) with coordinates changing within the interval (y_{i+1}^-, y_{i+1}^+), and instead of the single edge t_{i+1} (from vertex b_i to vertex b_{i+1}) we have the set of all edges {(b_ri, b_r(i+1)) | b_ri ∈ B(b_i), b_r(i+1) ∈ B(b_{i+1})} joining each vertex of the set B(b_i) to each vertex of the set B(b_{i+1}).
A more detailed description of how to generate the grammars of both the finite deterministic automaton and the fuzzy finite automaton for hand gesture recognition is presented in (Alfimtsev, 2008). Thus, two triangular membership functions µ_i(y) and µ_{i+1}(y) are defined by the triplets of points {y_i^-, y_i, y_i^+} and {y_{i+1}^-, y_{i+1}, y_{i+1}^+}, respectively. Each of these functions is determined by the following expression:

$$\mu(y) = \begin{cases} \dfrac{y - y_k^{-}}{y_k - y_k^{-}}, & \text{if } y_k^{-} \le y \le y_k, \\[6pt] \dfrac{y_k^{+} - y}{y_k^{+} - y_k}, & \text{if } y_k < y \le y_k^{+}, \end{cases} \qquad (9)$$
where k ∈ {i, i + 1}. These functions define the measure of closeness of the vertex coordinates to the "ideal coordinates", which correspond to a membership-function value equal to 1. The membership function of each edge (b_ri, b_r(i+1)) with incident vertices b_ri ∈ B(b_i) and b_r(i+1) ∈ B(b_{i+1}) is defined as:

$$\mu_{(b_{ri},\, b_{r(i+1)})}(t_{i+1}) = \min\{\mu_i(y_{ri}),\ \mu_{i+1}(y_{r(i+1)})\}. \qquad (10)$$
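The membership computations of Eqs. (9) and (10) can be written compactly, for example as in the sketch below (it assumes s > 0, so the denominators are nonzero).

```python
def tri_membership(y, y_lo, y_mid, y_hi):
    """Triangular membership mu(y) given by the triplet (y^-, y, y^+) (Eq. 9)."""
    if y_lo <= y <= y_mid:
        return (y - y_lo) / (y_mid - y_lo)
    if y_mid < y <= y_hi:
        return (y_hi - y) / (y_hi - y_mid)
    return 0.0                     # outside the support of the function

def edge_membership(mu_i, mu_next):
    """Membership of the edge between incident vertices b_ri and b_r(i+1) (Eq. 10)."""
    return min(mu_i, mu_next)
```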
The fuzzy grammar G_F is obtained from the regular grammar G (Alfimtsev, 2008). The set of rules P_F for the fuzzy grammar G_F is the following:

$$P_F = \{\, b_{ri} \rightarrow t_{i+1}\, b_{r(i+1)},\ \ \mu(t_{i+1}\, b_{r(i+1)}) = \mu_{(b_{ri},\, b_{r(i+1)})},\ \ i = 0, \dots, n-1 \,\}. \qquad (11)$$
According to (Alfimtsev, 2008), the grammar G with rules {b_ri → t_{i+1} b_r(i+1), i = 0, ..., n − 1} is comparable to the fuzzy grammar G_F if there is a sequence of fuzzy rules {µ_(b_ri, b_r(i+1)), i = 0, ..., n − 1} for which b_i = b_ri holds for all i = 0, ..., n − 1.
Thus, the dynamic gesture recognition algorithm, which uses a model based on fuzzy finite automata and the corresponding set of reference fuzzy grammars G_F1, G_F2, ..., G_Fm, contains the following steps (Alfimtsev, 2008):
1. A gesture is processed with the same sampling steps along the time axis as the reference gestures, and the set of grammars G_F1, G_F2, ..., G_Fm is constructed.
Figure 6: Gestures to control a quadcopter in V-Rep simu-
lator: (a) move forward, (b) move backward, (c) move up,
(d) move down, (e) turn right, (f) turn left.
2. The grammars G_F1, G_F2, ..., G_Fm corresponding to the recognizable gesture are compared with each corresponding fuzzy reference grammar G^k_F1, G^k_F2, ..., G^k_Fm, where k ∈ {1, ..., K} and K is the number of recognizable gestures.
3. For the sets of fuzzy reference grammars G^k_F1, G^k_F2, ..., G^k_Fm for which the comparison was successful, the corresponding set of membership-function values µ_{G_1, G^k_F1}, µ_{G_2, G^k_F2}, ..., µ_{G_m, G^k_Fm} is calculated according to formula (10), and then the value of the measure A_k, which characterizes the similarity of the recognizable gesture to the reference gesture k, is calculated by the formula:

$$A(G, G^{k}) = A_k = \max\{\mu_{G_1, G^{k}_{F1}},\ \mu_{G_2, G^{k}_{F2}},\ \dots,\ \mu_{G_m, G^{k}_{Fm}}\}, \qquad (12)$$
4. The recognizable gesture is considered coincident with the pattern gesture k for which the measure value A_k is maximal. If no grammar comparison is successful, recognition fails (i.e., the gesture is not recognized).
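A minimal sketch of steps 3 and 4 is given below: for each reference gesture k whose grammar comparison succeeded, the similarity A_k of Eq. (12) is the maximum of the per-projection membership values, and the gesture with the largest A_k is selected (recognition fails if no comparison succeeded). The data layout is an assumption for illustration, not the authors' implementation.

```python
def recognize(projection_memberships):
    """Select the reference gesture with maximal similarity A_k (Eq. 12, steps 3-4).

    projection_memberships maps a reference-gesture name k to the list of
    membership values [mu_{G_1,G^k_F1}, ..., mu_{G_m,G^k_Fm}] obtained from
    the successful grammar comparisons; an empty dict means recognition fails.
    """
    if not projection_memberships:
        return None                                 # no comparison succeeded
    scores = {k: max(mus) for k, mus in projection_memberships.items()}
    return max(scores, key=scores.get)              # gesture k with maximal A_k
```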
(a) The quadcopter in V-Rep environment.
(b) Control with the glove system (exp).
Figure 7: Experiments with hand-gesture control of the
quadcopter with the glove system in V-Rep simulator.
5 TESTS OF THE
HAND-GESTURE CONTROL
INTERFACE WITH THE
GLOVE SYSTEM
After implementing the glove-based system hardware and software, we conducted experiments to assess its recognition ability. To test the system's recognition rate and demonstrate the advantages of gesture-based control, we decided to run experiments controlling a UAV (in our case, the simulated quadcopter, Fig. 7a) with the glove system. To control the position and orientation of the quadcopter in the V-Rep simulator environment, we selected 6 gestures (Fig. 6): move forward and backward, move up and down, turn left and right. Fig. 7b shows an operator at the moment of controlling the quadcopter by hand gestures during its flight in a V-Rep maze. The video of this experiment with gesture control of the quadcopter by the wearable glove-based system in the V-Rep simulator is available on YouTube (exp). Before the UAV flight, the operator calibrates the glove system, recording in the software the main gestures that will be used for control. Using only the set of predefined motions, the user can easily navigate the UAV through the V-Rep maze. It should be noted, however, that in this scenario the operator only controls the desired drone navigation points, whereas the built-in V-Rep quadcopter control algorithms find an optimal way to reach them. The tests show that the implemented wearable glove system for hand-gesture-based control is intuitive, easily adjustable and customizable through a personal gesture library.
6 CONCLUSIONS AND FUTURE
WORK
In this paper, we presented the hardware and software implementation of a glove-based system that provides a user-friendly hand-gesture-based control interface built on gesture recognition methods with fuzzy finite state automata. The wearable glove system (with inertial IMU and flex sensors) and the gesture recognition methodology were used to manipulate a quadcopter model in the V-Rep simulator, demonstrating successful real-time gesture-based control of the quadcopter position and orientation with 6 different dynamic gestures. The methodology is based on the fuzzy c-means (FCM) clustering algorithm with the sum-of-squared-errors criterion (Devyatkov and Alfimtsev, 2007), which minimizes the distance between a cluster's points and the cluster center and maximizes the distance between cluster centers. The glove-based system's hardware and software can also be used in various applications to create human-computer interfaces. The software solution can be of interest, for instance in a teaching environment, for checking the correctness of gestures when there are many gestures and the trajectory matters as much as the final position.
The methodology for gesture recognition based on fuzzy finite automata and grammars has many advantages and capabilities, for example: (1) automatic creation of a pattern gesture model for each dynamic gesture; (2) dynamic gesture model training with a small training set (only a few examples); (3) reliable real-time recognition of dynamic gesture trajectories, including occlusions and intersections; (4) computational efficiency of such models. The fuzzy model has a computational complexity of O(mn), where m is the number of fuzzy automata used for recognition and n is the maximum number of states of a fuzzy finite automaton.
The future work may include extension of control
interface capabilities to manipulation of larger num-
ber of robots, additional testing with a real quadcopter
and comparison with alternative methods to clarify
the learning time, control accuracy, etc.
ACKNOWLEDGEMENTS
This research has been supported by the grant of Rus-
sian Ministry of Education and Science, agreement:
No14.606.21.0007, ID: RFMEFI60617X0007, "An-
droid Technics" company and Innopolis University.
REFERENCES
The tests of the hand-gesture-based control interface
with wearable glove system. Innopolis University:
https://youtu.be/f9dfKCUuUvY.
Afanasyev, I. and De Cecco, M. (2013). 3d gesture recogni-
tion by superquadrics. In International Conference on
Computer Vision Theory and Applications (VISAPP),
volume 2, pages 429–433. INSTICC.
Alfimtsev, A. (2008). Research and development of meth-
ods for dynamic gesture capture, tracking and recog-
nition. PhD thesis, Bauman Moscow State Technical
University.
Arkenbout, E. A., de Winter, J. C., and Breedveld, P. (2015).
Robust hand motion tracking through data fusion of
5dt data glove and nimble vr kinect camera measure-
ments. Sensors, 15(12):31644–31671.
Bansal, M., Saxena, S., Desale, D., and Jadhav, D. (2011).
Dynamic gesture recognition using hidden markov
model in static background. International Journal of
Computer Science Issues (IJCSI), 8(1).
Battaglia, E., Bianchi, M., Altobelli, A., Grioli, G., Cata-
lano, M. G., Serio, A., Santello, M., and Bicchi, A.
(2016). Thimblesense: a fingertip-wearable tactile
sensor for grasp analysis. IEEE Transactions on Hap-
tics, 9(1):121–133.
Bezdek, J. C., Ehrlich, R., and Full, W. (1984). Fcm: The
fuzzy c-means clustering algorithm. Computers &
Geosciences, 10(2-3):191–203.
Devyatkov, V. and Alfimtsev, A. (2007). Recognition of ma-
nipulative gestures. Bulletin of the Bauman Moscow
State Technical University. Series of «Instrument mak-
ing». In Russian., (3):56–74.
Dipietro, L., Sabatini, A. M., and Dario, P. (2008). A survey
of glove-based systems and their applications. IEEE
Transactions on Systems, Man, and Cybernetics, Part
C (Applications and Reviews), 38(4):461–482.
Dominio, F., Donadeo, M., and Zanuttigh, P. (2014). Com-
bining multiple depth-based descriptors for hand ges-
ture recognition. Pattern Recognition Letters, 50:101–
111.
Gawande, S. D. and Chopde, N. R. (2013). Neural network
based hand gesture recognition. International Journal
of Emerging Research in Management and Technol-
ogy, 3:2278–9359.
Gudhmundsson, S. A., Sveinsson, J. R., Pardas, M.,
Aanaes, H., and Larsen, R. (2010). Model-based hand
gesture tracking in tof image sequences. In Inter-
national Conference on Articulated Motion and De-
formable Objects (AMDO), pages 118–127. Springer.
Kenn, H., Van Megen, F., and Sugar, R. (2007). A glove-
based gesture interface for wearable computing appli-
cations. In 4th International Forum on Applied Wear-
able Computing (IFAWC), pages 1–10. VDE.
Kumar, P., Verma, J., and Prasad, S. (2012). Hand
data glove: A wearable real-time device for human-
computer interaction. International Journal of Ad-
vanced Science and Technology, 43:15–26.
Li, X. (2003). Gesture recognition based on fuzzy c-means
clustering algorithm. Department Of Computer Sci-
ence The University Of Tennessee Knoxville.
Luzhnica, G., Simon, J., Lex, E., and Pammer, V. (2016).
A sliding window approach to natural hand gesture
recognition using a custom data glove. In IEEE Sym-
posium on 3D User Interfaces (3DUI), pages 81–90.
Manresa, C., Varona, J., Mas, R., and Perales, F. J. (2005).
Hand tracking and gesture recognition for human-
computer interaction. ELCVIA Electronic Letters on
Computer Vision and Image Analysis, 5(3):96–104.
Mavridis, N., Giakoumidis, N., and Machado, E. L. (2012).
A novel evaluation framework for teleoperation and
a case study on natural human-arm-imitation through
motion capture. International Journal of Social
Robotics, 4(1):5–18.
Nayak, J., Naik, B., and Behera, H. (2015). Fuzzy c-
means (fcm) clustering algorithm: a decade review
from 2000 to 2014. In Computational intelligence in
data mining-volume 2, pages 133–149. Springer.
Park, Y., Lee, J., and Bae, J. (2015). Development of a wear-
able sensing glove for measuring the motion of fingers
using linear potentiometers and flexible wires. IEEE
Trans. on Industrial Informatics, 11(1):198–206.
Rautaray, S. S. and Agrawal, A. (2015). Vision based hand
gesture recognition for human computer interaction: a
survey. Artificial Intelligence Review, 43(1):1–54.
Ren, Z., Yuan, J., Meng, J., and Zhang, Z. (2013). Robust
part-based hand gesture recognition using kinect sen-
sor. IEEE Trans. on Multimedia, 15(5):1110–1120.
Suarez, J. and Murphy, R. R. (2012). Hand gesture recogni-
tion with depth images: A review. In 21st IEEE Inter-
national Symposium on Robot and Human Interactive
Communication (RO-MAN), pages 411–417. IEEE.
Suk, H.-I., Sin, B.-K., and Lee, S.-W. (2010). Hand ges-
ture recognition based on dynamic bayesian network
framework. Pattern recognition, 43(9):3059–3072.
Suryanarayan, P., Subramanian, A., and Mandalapu, D.
(2010). Dynamic hand pose recognition using depth
data. In Pattern Recognition (ICPR), 20th Interna-
tional Conference on, pages 3105–3108. IEEE.
Wachs, J., Kartoun, U., Stern, H., and Edan, Y. (2002).
Real-time hand gesture telerobotic system using fuzzy
c-means clustering. In Automation Congress, 2002
Proceedings of the 5th Biannual World, volume 13,
pages 403–409. IEEE.
Wachs, J. P., Kölsch, M., Stern, H., and Edan, Y. (2011).
Vision-based hand-gesture applications. Communica-
tions of the ACM, 54(2):60–71.
Zabulis, X., Baltzakis, H., and Argyros, A. (2009). Vision-
based hand gesture recognition for human-computer
interaction. 34:30.