could be gained from characteristics of the dynamic
behaviour. With this in mind, we propose a hierar-
chical dynamical model which takes into account two
levels of dynamics: inter-activity and intra-activity.
The model aims to represent the activities as intu-
itively as possible in terms of the patterns present in
the raw data from the sensors. Thus, not only are dif-
ferent activities recognized, but the “events” within a
given activity are also distinguished, for example, the
steps in the case of walking. Three different dynamic
models are described, each one pertaining to a partic-
ular type of activity: the first is for stationary activi-
ties like standing, sitting and lying; the second, for ac-
tive movements like walking and running, whilst the
third deals with short-time motions like jumping and
falling.
A further advantage of the proposed system is
that it uses raw signals directly from the sensor, thus
avoiding computationally expensive steps such as
feature extraction and selection. Because the system
is designed to capture the dynamics of the signals
directly, activity recognition is achieved with high
accuracy without this additional processing.
The paper is organized as follows: in Section 2
the activity recognition literature is reviewed. Section
3 describes the proposed hierarchical dynamic model.
The test procedure is outlined in Section 4, whilst in
Section 5 the results obtained with our model are pre-
sented. Section 6 provides a discussion of the oper-
ation of the proposed method. Finally, in Section 7,
conclusions and future lines of work are discussed.
2 BACKGROUND AND RELATED WORK
2.1 Sensors and Feature Extraction
The previously published literature in the area of hu-
man activity recognition using inertial sensors is quite
extensive. Most of the published work follows a simi-
lar approach of data collection and processing, as out-
lined in this section.
Perhaps the first consideration in any activity
recognition system is the selection of the type and the
number of sensors, as well as the positions on the hu-
man body where they will be worn. The simplest sen-
sor used in the recent literature is a triaxial accelerom-
eter (Han et al., 2010; Krishnan et al., 2008; He and
Jin, 2008; Khan et al., 2010). In (Frank et al., 2010;
Altun and Barshan, 2010; Zhu and Sheng, 2010), in-
ertial measurement units (IMU), combining triaxial
accelerometers and triaxial gyroscopes, are used to
provide measurements of specific force and angular
rate, respectively. As has been previously mentioned,
the larger the number of sensors used, the more activi-
ties the system can recognize. Similarly, the choice of
sensor positions on the body is crucial. In the case of
a single sensor, the most popular place is the waist, on
the belt or in the pocket of the trousers (Frank et al.,
2010; Han et al., 2010; He and Jin, 2008). In this
work, a single IMU placed on either the left or right
hip is considered for testing purposes, although the
model is not limited to this configuration.
The first processing step is, typically, focused on
the construction of a feature vector derived from the
raw signals of the sensor. In the literature, a large
number of different features have been reported as
being suitable for the classification task considered in
this work; (Preece et al., 2009) provides a comparison
of the most popular features. A common approach
is to extract many features (for example in (Krishnan
et al., 2008) thirty-nine features are extracted); then,
dimensionality reduction techniques such as Principal
Component Analysis (PCA) or Linear Discriminant
Analysis (LDA) are used to reduce the size of the fea-
ture vector before classification.
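
As a minimal sketch of this feature-based approach (which is not the approach adopted in this work), the following Python fragment computes a small feature vector from one window of triaxial accelerometer samples. The function name `window_features` and the choice of features (per-axis mean, per-axis standard deviation and the signal magnitude area) are illustrative assumptions and are not taken from any of the cited works; the subsequent dimensionality reduction step (PCA or LDA) is omitted.

```python
from statistics import fmean, pstdev


def window_features(window):
    """Return a feature vector for one window of (ax, ay, az) samples.

    Hypothetical feature set, for illustration only: per-axis mean,
    per-axis population standard deviation, and the signal magnitude
    area (SMA).
    """
    if not window:
        raise ValueError("window must contain at least one sample")
    axes = list(zip(*window))  # three tuples: all ax, all ay, all az
    means = [fmean(axis) for axis in axes]
    stds = [pstdev(axis) for axis in axes]
    # SMA: average over the window of |ax| + |ay| + |az|
    sma = fmean(sum(abs(v) for v in sample) for sample in window)
    return means + stds + [sma]


# Example: four samples from a sensor at rest, with gravity along z
window = [(0.0, 0.0, 9.8), (0.1, -0.1, 9.8), (0.0, 0.0, 9.8), (-0.1, 0.1, 9.8)]
features = window_features(window)  # 3 means, 3 standard deviations, 1 SMA
```

Even this small feature set turns each window of raw samples into a seven-element vector; feature sets reported in the literature are typically much larger, which is what motivates the reduction step.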
In addition to the processing required for feature
extraction and selection, another disadvantage of this
approach is that a predefined window length must be
determined to compute the features. Furthermore, an
overlap is often used between consecutive windows.
The selection of such parameters is somewhat arbi-
trary and there is a lack of agreement on the best
choice; in the literature, the window length varies
widely (e.g. from 16 ms (Han et al., 2010) to 6 s
(Bao and Intille, 2004)), whilst a 50% overlap is com-
mon.
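
The windowing scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `sliding_windows`, the expression of the window length in samples and the decision to drop a trailing partial window are choices made for the example, and are not prescribed by any of the cited works.

```python
def sliding_windows(samples, length, overlap=0.5):
    """Yield consecutive windows of `length` samples.

    `overlap` is the fraction of each window shared with the next one,
    so overlap=0.5 gives the common 50% overlap. A trailing partial
    window is dropped.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Number of samples to advance between the starts of two windows
    step = max(1, int(round(length * (1.0 - overlap))))
    for start in range(0, len(samples) - length + 1, step):
        yield samples[start:start + length]


signal = list(range(10))  # stand-in for ten raw sensor samples
windows = list(sliding_windows(signal, length=4, overlap=0.5))
# windows == [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Both `length` and `overlap` must be fixed before any feature is computed, which is the source of the arbitrariness noted above: the window length in samples is the chosen duration multiplied by the sampling rate, so at a hypothetical rate of 100 Hz a 6-second window would correspond to `length=600`.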
Once the feature vector has been computed from
the windowed signals, the next step is the develop-
ment of a model that is able to discriminate among
activities. The most popular methods that have been
used to solve this sequential supervised learning prob-
lem are batch supervised learning algorithms and Dy-
namic Bayesian Networks (DBN).
(Altun and Barshan, 2010) compares the classification
results of various batch supervised learning
algorithms, including Bayesian Decision Making
(BDM), the Least-Squares Method (LSM), k-Nearest
Neighbor (k-NN), Support Vector Machines (SVM)
and Artificial Neural Networks (ANN).
Batch supervised learning algorithms, which
ignore the dynamics of the signals, are not consid-
ered in this work. One reason for this is to bypass
the feature extraction step and, furthermore, it will be
seen that consideration of the dynamics of the signals
BIOSIGNALS 2012 - International Conference on Bio-inspired Systems and Signal Processing