start the recognition task with a small set of sensors
and then interactively send queries for more features,
as needed. In this way, we can afford to run the activ-
ity recognition engine on a low-powered device with-
out sacrificing accuracy. We present empirical re-
sults on real data, which illustrate the utility of this
approach.
2 METHODOLOGY AND DATA
The data set used for the feature selection experi-
ment was collected by Dieter Fox (Subramanya and
Raj, 2006), using the Intel Mobile Sensing Platform
(MSP) (Choudhury et al., 2008), which contains several
sensors, including a 3-axis accelerometer, a 3-axis
gyroscope, a visible-light phototransistor, a barometer,
and humidity and ambient-light sensors. Six participants
wore the MSP units on a belt at the side of their
hip and were asked to perform six different activities
(walking, lingering, running, climbing upstairs, walk-
ing downstairs and riding a vehicle) over a period of
three weeks. Ground truth was acquired through ob-
servers who marked the start and end points of the
activities. The working data set comprised 50 hours of
labelled data, excluding both the beginning of each
recording (which was labelled as unannotated) and some
long sequences (over 1 minute) labelled as unknown.
There were also some short unlabelled segments, which
we smoothed out using a moving-average
filter. We computed the magnitude of the acceleration,
$\sqrt{x^2 + y^2 + z^2}$, based on components sampled
at 512 Hz. We also used the gyroscope (sampled at
64 Hz), barometric pressure (sampled at 7.1 Hz) and
visible light (sampled at 3 Hz). These four measures
were all up-sampled to 512 Hz in order to obtain time
series of equal length. To prevent overfitting to
characteristics of the locations, we did not include the
humidity and temperature sensors, as they could
potentially mislead the classifier into reporting a false
correlation between location and activities. For example,
if a lot of walking data were collected under a hot sun,
the classifier would treat temperature as a feature
relevant to walking.
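As a concrete illustration of this preprocessing, the sketch below computes the acceleration magnitude, up-samples a lower-rate stream to 512 Hz, and applies a moving-average filter. The function names, the linear-interpolation scheme, and the filter window are our assumptions; the paper does not specify them.

```python
import numpy as np

FS_TARGET = 512  # Hz, the accelerometer sampling rate

def acceleration_magnitude(x, y, z):
    """Combine the 3-axis accelerometer components into one magnitude signal."""
    return np.sqrt(x**2 + y**2 + z**2)

def upsample(signal, fs_in, fs_out=FS_TARGET):
    """Up-sample a lower-rate sensor stream to fs_out.

    Linear interpolation is an assumption; the paper only states that
    the streams were up-sampled to 512 Hz."""
    t_in = np.arange(len(signal)) / fs_in
    t_out = np.arange(int(len(signal) * fs_out / fs_in)) / fs_out
    return np.interp(t_out, t_in, signal)

def moving_average(signal, window=512):
    """Moving-average filter used to smooth short unlabelled segments
    (the window size is an assumption)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")
```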
For the classification task, we used random
forests (Breiman, 2001), a state-of-the-art ensemble
classifier that also provides a certainty measure for
each classification. The random forest algorithm builds
many classification trees, where each tree votes for
a class and the forest chooses the majority label.
Assume we have N instances in the training set and
there are M tests (based on the features) for each
instance. To grow a tree, N instances are sampled at
random with replacement to form that tree's training
set. At each node, m ≪ M tests are randomly chosen
and the best one of these is used to split the node.
Each tree grows until all leaves are pure, i.e. no
pruning is performed. For each tree, the instances left
out of its bootstrap sample (normally about one-third
of the N instances) serve as a validation set, giving a
running estimate of the classification error as trees
are added to the forest. The error on this
out-of-bag (OOB) data gives an unbiased error esti-
mate. This classifier is very efficient computationally
during both training and prediction, while maintaining
good accuracy.
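A minimal sketch of such a training run, using scikit-learn's RandomForestClassifier (our choice of implementation; the paper does not name one). The hyperparameter values and the placeholder data are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder data: in our setting, rows are time samples, columns are raw
# sensor values (Acc, Bar, Gyro, VisLight), and labels are the six activities.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 4))
y_train = rng.integers(0, 6, size=1000)

clf = RandomForestClassifier(
    n_estimators=100,      # number of trees (a typical default, not from the paper)
    max_features="sqrt",   # the m << M tests tried at each node
    bootstrap=True,        # each tree trains on a bootstrap sample of the N instances
    oob_score=True,        # estimate the error on the out-of-bag instances
    random_state=0,
)
clf.fit(X_train, y_train)
print("OOB accuracy estimate:", clf.oob_score_)
```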
We also need a probabilistic certainty measure,
which should reflect how confident the classification
is. We will use this quantity to manage the sensor
selection procedure. When using random forests, for
any given sample in the validation set, the classifier
not only predicts a label, but also reports what pro-
portion of the votes given by all trees matches the pre-
dicted label. We used this ratio as a certainty measure.
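This vote-fraction certainty can be tallied directly from the individual trees. The sketch below assumes the scikit-learn forest from the previous snippet; note that scikit-learn's predict_proba averages per-tree probabilities rather than counting hard votes, so we count the votes explicitly:

```python
import numpy as np

def majority_vote_certainty(forest, X):
    """Return the majority-vote label for each sample in X together with
    the fraction of trees that voted for it (the certainty measure)."""
    # (n_trees, n_samples) matrix of individual tree votes. Note that
    # scikit-learn's trees predict encoded class indices; map back with
    # forest.classes_ if the original labels are needed.
    votes = np.stack([tree.predict(X) for tree in forest.estimators_])
    n_trees, n_samples = votes.shape
    labels = np.empty(n_samples, dtype=votes.dtype)
    certainty = np.empty(n_samples)
    for i in range(n_samples):
        values, counts = np.unique(votes[:, i], return_counts=True)
        best = counts.argmax()
        labels[i] = values[best]
        certainty[i] = counts[best] / n_trees
    return labels, certainty
```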
3 INITIAL EXPERIMENTS WITH
SENSOR SELECTION
Table 1: Individual classifiers. The starred line in each
section denotes the classifier with the highest accuracy.

No.  Feature Set                  Accuracy (%)
 1*  {Acc, Bar, Gyro, VisLight}   86.16

 2   {Acc, Bar, Gyro}             75.16
 3*  {Acc, Bar, VisLight}         86.50
 4   {Acc, Gyro, VisLight}        84.33
 5   {Bar, Gyro, VisLight}        78.33

 6   {Bar, Gyro}                  54.00
 7   {Acc, Gyro}                  69.50
 8   {Acc, Bar}                   74.83
 9*  {Acc, VisLight}              77.66
10   {Bar, VisLight}              74.00
11   {Gyro, VisLight}             74.00

12*  {Acc}                        48.16
First, we wanted to verify the effect of differ-
ent subsets of sensors on the accuracy of recogniz-
ing the six different activities. We began by examin-
ing all possible combinations of sensors on the entire
data set. We treated each time sample as an instance
and used the raw sensor data as features for the
classification task. We performed cross-validation over
users (leaving each user's dataset aside in turn as the
test set and combining and randomizing all the other
datasets to use as the training set). The accuracy of
the classifiers for all 12 possible combination sets of
the four sensors is shown in Table 1.¹

¹Single features except the accelerometer are excluded
from the results due to poor performance.
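One way to express this leave-one-user-out protocol, again with scikit-learn; LeaveOneGroupOut, the per-sample user ids, and the placeholder arrays are our own choices, not the paper's:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.utils import shuffle

# Placeholder arrays: X holds raw sensor values, y the activity labels,
# and users the id of the participant each time sample came from.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
y = rng.integers(0, 6, size=600)
users = np.repeat(np.arange(6), 100)

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=users):
    # Combine and randomize all other users' data to form the training set.
    X_tr, y_tr = shuffle(X[train_idx], y[train_idx], random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_tr, y_tr)
    scores.append(clf.score(X[test_idx], y[test_idx]))
print("Per-user accuracies:", np.round(scores, 3))
```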