EXTRACTING PERSONAL USER CONTEXT WITH A THREE-AXIS
SENSOR MOUNTED ON A FREELY CARRIED CELL PHONE
Toshiki Iso
Network Laboratories, NTT DoCoMo, Inc.
3-5 Hikarino-oka, Yokosuka, Kanagawa, Japan
Kenichi Yamazaki
Network Laboratories, NTT DoCoMo, Inc.
3-5 Hikarino-oka, Yokosuka, Kanagawa, Japan
Keywords:
Context extraction, sensors, cell phone, wavelet packet, Kohonen self-organizing map, ubiquitous service,
accelerometer.
Abstract:
To realize ubiquitous services such as presence services and health care services, we propose an algorithm to extract "personal user context", such as the user's behavior, by processing information gathered by a three-axis accelerometer mounted on a cell phone. Our algorithm has two main functions: one is to extract feature vectors by analyzing the sensor data in detail with wavelet packet decomposition; the other is to flexibly cluster personal user context by combining a self-organizing algorithm with Bayesian theory. A prototype that implements the algorithm was constructed. Experiments on the prototype show that the algorithm can identify personal user contexts such as walking, running, going up/down stairs, and walking fast with an accuracy of about 88[%].
1 INTRODUCTION
Cell phones are becoming indispensable in daily life
and are carried everywhere due to their sophisti-
cated functions such as camera, music player, IrDA,
wireless LAN, GPS, and an IC chip for an electronic wallet (NTT DoCoMo, Inc.). Since cell phones have become personal assistants, they are well placed to identify user behavior. Mobile service providers demand user context, such as user behavior, because they want to provide services appropriate to the mobile user's situation. Identifying "personal user context" is especially important in advancing health care services and service navigation.
There are many related works on context-aware systems using sensors in mobile environments (Siewiorek et al., 2003), (Krause et al., 2003), (Kern et al., 2004), (K. V. Laerhoven, 2003), (Miao et al., 2003), (DeVaul and Dunn), (Clarkson et al., 2000), (Healey and Logan), LBao2004, (M. Unuma, 2004), (Randell and Muller, 2000).
However, these conventional methods have two
practical issues; one is that users need to wear some
sensors on specific parts of their bodies, the other is
that their feature extraction output is suitable only for
a limited range of user contexts. In regard to these issues, most papers adopted the wearable computer approach, and features were extracted from the sensor data by FFT-based methods. Thus, they required several sensors to be fixed to different parts of the user's body to achieve a high degree of accuracy. Moreover, it was difficult
to analyze the localized wave data present in the sen-
sor data because Fourier transform has lower time-
frequency resolution than wavelet transform. Their
methods are not realistic because wearing many sen-
sors is very cumbersome. On the other hand, one related work (Si et al., 2005) proposed a method of extracting context that is independent of sensor position; however, its context extraction performance was limited because feature extraction was based on the magnitude of the three-axis sensor data.
As a solution to these issues, we proposed the "PerContEx" system (Iso et al., 2005), which could extract personal user context by applying an algorithm to data collected from the user's cell phone. While the system placed no constraints on how the phone was carried, the high computational cost of the algorithm meant that it was not always possible to realize ubiquitous services that require real-time processing. Therefore, as another approach, we propose here a method of extracting personal user context by applying wavelet packet decomposition to the sensor data generated by the user's activities. It can identify highly detailed user contexts such as walking at normal speed, running, walking fast, and going up/down stairs. This is because the wavelet packet decomposition provides a finer analysis of the sensor data than FFT.
Section 2 explains the concept of processing in PerContEx, which extracts personal user context. Section 3 describes the detailed algorithm used at each stage of PerContEx. We then present some experiments. Finally, we draw several conclusions and describe future work.
2 OVERVIEW OF PROCESSING
Since we use a three-axis sensor to implement our
proposals, our system must satisfy the following re-
quirements;
feature extraction: it should deal with a variety of
personal user contexts
context extraction: it should be independent of how
the three-axis sensor is carried.
To satisfy the first requirement, we propose a method
to extract features from sensor data by using the Best
Basis method with wavelet packet decomposition. To
satisfy the second requirement, we propose a context extraction method that combines a self-organizing algorithm with Bayesian theory; a variety of pseudo-data is used as training data for context clustering.
We show the overall processing flow of the algo-
rithm in Figure 1. The algorithm has three stages:
feature extraction, learning, and personal user context
extraction.
At first, in order to handle the variety of signal pat-
terns which are possible, we use wavelet packet de-
composition (WPD) for wave analysis in the feature
extraction stage. This is because WPD provides finer
analysis in each frequency range through the use of
localized orthogonal basis functions. Moreover, by
using information criteria, we can obtain decomposed
signals that efficiently represent the features
of signal patterns. We extract feature vectors from the
decomposed signals.
Second, in order to avoid the influence of data fluc-
tuation and noise, we use the Kohonen self-organizing
map (KSOM) as a non-linear learning method. This
is because KSOM can provide not only robustness
to noise and signal fluctuation, but also flexibility in
terms of the number of clusters. By calculating ap-
pearance probabilities of personal user contexts, we
can extract the most representative feature vectors
and time-series data while avoiding the vague vectors.
Moreover, in order to achieve robustness with regard
to where the sensors are set, we prepare a variety of
KSOMs and train them using pseudo-data generated
from the most representative time-series data by rota-
tional transformation.
Third, in order to extract personal user context, we
use all feature maps including the appearance possi-
bilities of personal user context trained by both ob-
served and pseudo-data. Finally, we can extract the
personal user context from three-axis sensor data by
setting adequate criteria. We describe the three stages
below.
3 ALGORITHM
3.1 Feature Extraction Stage
Noise is naturally present in time-series data from
sensors mounted on mobile devices. Moreover, in
the case of cell phones carried freely, the time-series
data often contain signals generated by non-targeted
behavior. It is difficult to make a comprehensive
disturbance model because of the variety of distur-
bances. Under the assumption that the majority of
time-series data is generated by the targeted behav-
iors, we propose a method to extract features by an-
alyzing the time-series data decomposed by a combi-
nation of wavelet packet and information entropy.
3.1.1 Sensor Data Extraction and Preprocessing
At first, we divide the time-series data from the three-axis sensor into frame data. We define $t_s$ as the time range of the $s$-th frame and assume that the three-axis sensor does not experience twisting within that frame:

$\{\, t_s \mid \tfrac{1}{2}(s-1)\Delta T \le t_s < \tfrac{1}{2}(s+1)\Delta T \,\} \quad (s = 1, 2, \cdots)$   (1)

where $\Delta T$ is the frame length. When dividing the sensor data, some overlap time must be set to obtain data stability. We define the observed three-axis sensor data $a(t)$ as

$a(t) = (a_x(t),\, a_y(t),\, a_z(t))$   (2)
Feature extraction is preceded by preprocessing oper-
ations such as data and time calibration.
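To make the framing of Eq. (1) concrete, the sketch below splits a three-axis record into half-overlapping frames; the 3[sec] frame length and 1.5[sec] overlap follow the experimental settings of Section 4, while the function name and array layout are illustrative assumptions rather than the PerContEx implementation.

# Sketch: split three-axis sensor data into half-overlapping frames (cf. Eq. (1)).
import numpy as np

def make_frames(a, fs=100.0, frame_sec=3.0, overlap_sec=1.5):
    """a: array of shape (T, 3) holding (a_x, a_y, a_z) samples."""
    frame_len = int(frame_sec * fs)
    hop = frame_len - int(overlap_sec * fs)      # 50% overlap -> hop = frame_len / 2
    frames = [a[start:start + frame_len]
              for start in range(0, len(a) - frame_len + 1, hop)]
    return np.stack(frames)                      # shape: (num_frames, frame_len, 3)

# Example: 60 s of dummy data at 100 Hz -> 39 frames of 300 samples each.
a = np.random.randn(6000, 3)
print(make_frames(a).shape)                      # (39, 300, 3)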
3.1.2 Wavelet Packet Decomposition
Time-series data analysis is commonly based on
Fourier transform. Actually, it is a very strong tool
for feature extraction if we can assume that the in-
put represents regular periodic data that extends over
relatively long periods. Unfortunately, this assump-
tion fails in the case of time-series data generated
by the user’s behavior; the data here is generated by
many short-term events. Therefore, wavelet trans-
form should be applied rather than Fourier transform.
Wavelet transform is very common in the fields of sig-
nal processing applications such as image compres-
sion and analysis of electrocardiograms (Ishikawa,
2000).
Figure 1: Processing flow. (The diagram shows the pipeline: sensory data extraction and preprocessing; wavelet packet decomposition; best basis selection based on AIC; XYZ feature patterns composed from the entropy distribution on each axis; self-organizing feature map training and clustering; a map of specific-pattern occurrence probabilities based on best-matching units; basis feature pattern and basis sensory data extraction; pseudo-sensory data generation; and specific pattern extraction.)
Wavelet packet analysis provides more detail be-
cause it uses a splitting algorithm that downsamples
not only the scaling components but also the wavelet
components.
In order to realize flexible analysis of the sensor data, we use discrete wavelet packet decomposition (DWPD). Using DWPD, the three-axis sensor data $a(t)$ can be represented at level $p$ ($p = 0, 1, \dots, p_{\max}$) as

$a(t) = \sum_{q=0}^{2^{p}-1} u^{(p,q)}(t)$   (3)

where $u^{(p,q)}(t)$ is the $q$-th decomposed signal at level $p$ ($q = 0, 1, \dots, 2^{p}-1$).
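If PyWavelets is available, Eq. (3) can be checked numerically: reconstructing each level-$p$ packet node on its own and summing the results recovers the original signal. This is only a sanity-check sketch under arbitrary choices of signal, level, and boundary mode, not part of the PerContEx implementation.

# Sketch: check Eq. (3) with PyWavelets -- the level-p packet nodes u^(p,q),
# reconstructed one at a time and summed, recover a(t).
import numpy as np
import pywt

x = np.random.randn(256)        # stand-in for one axis a_i(t) of a frame
p = 3
wp = pywt.WaveletPacket(data=x, wavelet="haar", mode="periodization", maxlevel=p)

recon = np.zeros_like(x)
for node in wp.get_level(p, order="natural"):    # the 2**p nodes u^(p,q)
    single = pywt.WaveletPacket(data=None, wavelet="haar",
                                mode="periodization", maxlevel=p)
    single[node.path] = node.data                # keep only one node, zeros elsewhere
    recon += single.reconstruct(update=False)[:len(x)]

print(np.allclose(recon, x))                     # expected: True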
3.1.3 Best Basis based on Information Entropy
In order to find the best basis for $a(t)$, we define a cost function based on information entropy, $L_i(p, q)$, as follows:

$L_i(p, q) = -\sum_{t_s} |u_i^{(p,q)}(t_s)|^2 \log |u_i^{(p,q)}(t_s)|^2 \quad (i = x, y, z)$   (4)
We find the pairs $(p_i^B(k), q_i^B(k))$ that minimize $L_i(p, q)$:

$(p_i^B(k), q_i^B(k)) = \arg\min_{p,q} L_i(p, q)$   (5)
where $k$ indexes the $k$-th best-basis node. Then, $a_i(t_s)$ $(i = x, y, z)$ can be represented as

$a_i(t_s) = \sum_k u_i^{(p_i^B(k),\, q_i^B(k))}(t_s)$   (6)
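A simplified reading of Eqs. (4)-(5) is sketched below: every node of the packet tree is assigned the entropy cost of Eq. (4), and the lowest-cost nodes are reported. The full best-basis selection described above additionally requires the chosen nodes to form a disjoint cover of the frequency axis, which this illustrative ranking omits; the function names and the eps guard against log(0) are illustrative assumptions.

# Sketch: entropy cost of Eq. (4) per wavelet packet node, plus a simplified
# stand-in for Eq. (5) that merely ranks nodes by cost.
import numpy as np
import pywt

def entropy_cost(u, eps=1e-12):
    """L_i(p, q) = -sum(|u|^2 * log|u|^2), with eps guarding log(0)."""
    e = np.asarray(u, dtype=float) ** 2
    return float(-np.sum(e * np.log(e + eps)))

def rank_nodes_by_cost(signal, wavelet="haar", max_level=5):
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet,
                            mode="periodization", maxlevel=max_level)
    costs = {(level, node.path): entropy_cost(node.data)
             for level in range(1, max_level + 1)
             for node in wp.get_level(level, order="natural")}
    return sorted(costs.items(), key=lambda kv: kv[1])

# Example: the five lowest-cost (level, path) pairs for one axis of a frame.
x = np.random.randn(256)
for (level, path), cost in rank_nodes_by_cost(x)[:5]:
    print(level, path, round(cost, 3))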
Figure 2: Feature vectors generated by wavelet packet decomposition. (The figure depicts the wavelet packet tree of nodes $u^{(p,q)}$ down to level 3 and, for each axis, the periodogram distribution and the information entropy distribution derived from the best-basis nodes, from which the feature vectors are generated.)
3.1.4 Definition of Feature Vectors
As a wavelet packet is a kind of bandpass filter, each $u_i^{(p_i^B(k),\, q_i^B(k))}$ is filtered within its own frequency range. We calculate both the periodogram and the information entropy of each $u_i^{(p_i^B(k),\, q_i^B(k))}$ $(i = x, y, z)$ (see Figure 2). Next, we calculate the feature vectors $Xs_i^B$ by distributing the periodograms of the best basis to the maximum level $p_{\max}$, and we define the feature vector $Xs$ as

$Xs = (Xs_x^B\ Xs_y^B\ Xs_z^B)$   (7)
where

$Xs_i^B = (Xs_i^B(1), \dots, Xs_i^B(n), \dots, Xs_i^B(N_{\max}))$   (8)
Figure 3: Trajectory of three-axis sensor data. (a) Walking on flat road. (b) Going up stairs.
$Xs_i^B(n) = \dfrac{1}{2\Delta T} \sum_{t_s} \left| u_i^{(p_i^B(k),\, q_i^B(k))}(t_s)\, \exp(-j\omega_n t_s) \right|^2$   (9)
and

$\omega_n = \dfrac{\omega_{\max}}{2^{p_{\max}}}\, n(p, q)$   (10)

$2^{p_{\max}-p}\, q \le n(p, q) \le 2^{p_{\max}-p}(q + 1) - 1$   (11)
where $\omega_n$ represents frequency. The above formula does not contain $Xs_i^B(0)$, the DC component, because it basically reflects gravity; moreover, it is the AC components, not the DC component, that carry effective features for determining user behavior. Conventional methods have a critical constraint in that sensors must be fixed to specific parts of the user's body. Therefore, in order to extract the contexts they targeted, they selected specific components of the three-axis sensor data or used its magnitude. However, such methods can extract only a limited range of contexts because they treat the sensor data as scalar data, not vector data.
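The sketch below illustrates the spirit of Eqs. (9)-(11): each selected packet node is reconstructed to the time domain, its power spectrum is computed, and the spectra are accumulated over the AC frequency bins while the DC bin is dropped. For simplicity it uses all nodes of one level as a stand-in for the best basis and a power-of-two frame length; the normalization and names are illustrative assumptions.

# Sketch of a periodogram-style feature in the spirit of Eq. (9); the best-basis
# selection is replaced here by all nodes of one level for brevity.
import numpy as np
import pywt

def periodogram_feature(x, wavelet="haar", level=3, fs=100.0):
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet,
                            mode="periodization", maxlevel=level)
    n_bins = len(x) // 2                         # AC bins only; DC is dropped
    xs = np.zeros(n_bins)
    for node in wp.get_level(level, order="natural"):
        single = pywt.WaveletPacket(data=None, wavelet=wavelet,
                                    mode="periodization", maxlevel=level)
        single[node.path] = node.data
        u = single.reconstruct(update=False)[:len(x)]
        spec = np.abs(np.fft.rfft(u)) ** 2 / (2 * len(x) / fs)
        xs += spec[1:n_bins + 1]                 # skip the DC component Xs(0)
    return xs

x = np.random.randn(256)                         # one axis of a frame (power-of-two length)
print(periodogram_feature(x).shape)              # (128,)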
Figure 3 shows trajectories of three-axis acceleration data for the user actions of walking on a flat road and going up stairs. Both actions yield characteristic
three-dimensional patterns, a kind of chaos attractor.
It is difficult to detect features accurately by project-
ing just one or two axes. Therefore, we use all com-
ponents of sensor data as vector data which allows us
to assess as many kinds of user behavior as possible.
We define two kinds of feature vectors, $Xs_i^B$ and $M_i$, as a vector and a statistic of information entropy, respectively. $Xs_i^B$ is the feature vector used to detect similar sensor data, while $M_i$ is used to analyze the attributes of the sensor data.
At first, we define the feature vector $Xd$, which is composed of the feature vectors $Xd_i^B$ obtained by distributing the information entropy of the best basis to the maximum level $p_{\max}$, as follows:

$Xd = (Xd_x^B\ Xd_y^B\ Xd_z^B)$   (12)

where

$Xd_i^B = (Xd_i^B(1), \dots, Xd_i^B(k), \dots, Xd_i^B(2^{p_{\max}}-1))$   (13)

$Xd_i^B(k) = \dfrac{|L_i^B(p_i^B(k), q_i^B(k))|^2}{2^{p_{\max}-p}}$   (14)
This allows us to calculate the moments $M_i$ as follows:

$M_i = (M_i(1), \dots, M_i(l), \dots, M_i(N_l))$   (15)

where

$M_i(Xd_i^B) = \sum_l (l - M(1))^n\, Xd_i^B(l) \quad (n = 1, 2, 3, 4)$   (16)

and

$M(1) = \dfrac{\sum_l l\, Xd_i^B(l)}{\sum_l Xd_i^B(l)}$   (17)
Afterward, we use $Xs = (Xs_x^B\ Xs_y^B\ Xs_z^B)$ and $M = (M_x\ M_y\ M_z)$ as the feature vectors of each frame of three-axis sensor data for extracting personal user context.
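The entropy-side features can be sketched in the same spirit: given any vector of per-node entropy values standing in for $Xd_i^B$ (e.g. the output of the earlier ranking sketch), the four moment-style statistics of Eqs. (15)-(17) are computed around the centroid $M(1)$. The function name and dummy input are illustrative.

# Sketch: moment-style statistics M_i(n), n = 1..4, of an entropy distribution
# Xd, following Eqs. (15)-(17).
import numpy as np

def moment_features(xd):
    """M_i(n) = sum_l (l - M(1))**n * Xd(l), with M(1) the centroid of Eq. (17)."""
    xd = np.asarray(xd, dtype=float)
    l = np.arange(1, len(xd) + 1, dtype=float)
    centroid = np.sum(l * xd) / np.sum(xd)               # M(1)
    return np.array([np.sum((l - centroid) ** n * xd) for n in (1, 2, 3, 4)])

xd = np.array([0.8, 2.1, 0.3, 1.7, 0.9, 0.2, 1.1, 0.6])  # per-node entropies (dummy)
print(moment_features(xd))                               # auxiliary statistics M_i(1..4)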
3.2 Training Data Learning Stage
Most human actions, such as walking, exhibit fluctuations and noise. Moreover, they are always changing. This suggests that non-linear approaches to clustering them are better than linear ones such as principal component analysis, because linear approaches are sensitive to the influence of fluctuation and noise. From among the available non-linear approaches, we choose the Kohonen self-organizing map (KSOM) (Kohonen) for clustering personal user context such as human behavior. This is because KSOM offers competitive learning and reasonable computational costs, and it eliminates the need to set the number of clusters beforehand.
Table 1: Dictionary data example.

Feature Vector | Personal User Context
Xd(t_1)        | Walking at normal speed
Xd(t_2)        | Walking fast
...            | ...
Xd(t_k)        | Going up stairs
Xd(t_{k+1})    | Going down stairs
...            | ...
Xd(t_{k+N})    | Running
However, KSOM has several problems. One of them is its weakness in deciding cluster borders on the learnt feature map because of the existence of uncertain cells. In order to overcome this problem, we apply a probability-based technique after learning the feature map. Under our proposal, we prepare dictionary data (see Table 1) whose labels are kinds of personal user context. This allows us to calculate the probabilities. By using the probability information, we can obtain the sensor data most representative of each personal user context.
3.2.1 Competitive Training by KSOM
We use feature vector $Xs$ as the main feature vector and feature vector $M$ as an auxiliary feature vector, because $Xs$ captures the characteristics of the sensor data in more detail than $M$. At this stage, we use the feature vectors $Xs$ as input vectors $X_i$ to train the KSOM. They are generated from three-axis sensor data associated with each personal user context. In training, the KSOM matching measure and update rule are set as follows:
$\|X_i - X_{n_c}\| = \min_c \{\|X_i - X_c\|\}$   (18)

$X^{Nearest}_{n_c}(t+1) = X^{Nearest}_{n_c}(t) + \delta(t)\, h_{ci}(r(t))\, \{X_i(t) - X^{Nearest}_{n_c}(t)\}$   (19)
where $X_{n_c}(t)$, $X^{Nearest}_{n_c}(t)$, $X_i(t)$ and $c$ are the vector of the cell nearest to $X_i(t)$, the neighborhood vectors of $X_{n_c}(t)$, the input feature vector at time $t$ in training, and a cell in the KSOM, respectively. $\delta(t)$, $h_{ci}$ and $r(t)$ are the learning rate, the neighborhood kernel around the winning unit $n_c$, and the neighborhood radius, respectively. After training, we obtain a 2D array of cells on the feature map that represents the distribution of the training data. The cells also hold information about the number of times each personal user context appeared.
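A minimal KSOM training loop in the sense of Eqs. (18)-(19) is sketched below: the best-matching cell is found, and it and its neighbours are pulled toward the input under decaying learning rate and radius. The map size matches the 15 x 10 cells used in Section 4; the schedules, Gaussian kernel, and random stand-in data are illustrative choices, not the settings of the prototype.

# Sketch: KSOM competitive training following Eq. (18) (best-matching cell)
# and Eq. (19) (neighbourhood update).
import numpy as np

rng = np.random.default_rng(0)

def train_ksom(X, rows=15, cols=10, n_iter=5000, lr0=0.5, r0=5.0):
    dim = X.shape[1]
    W = rng.normal(size=(rows, cols, dim))               # cell reference vectors
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1) # cell grid coordinates
    for t in range(n_iter):
        x = X[rng.integers(len(X))]
        d = np.linalg.norm(W - x, axis=-1)               # Eq. (18): nearest cell
        bmu = np.unravel_index(np.argmin(d), d.shape)
        lr = lr0 * (1.0 - t / n_iter)                    # learning rate delta(t)
        r = max(r0 * (1.0 - t / n_iter), 0.5)            # neighbourhood radius r(t)
        g = np.exp(-np.sum((grid - np.array(bmu)) ** 2, axis=-1) / (2 * r * r))
        W += lr * g[..., None] * (x - W)                 # Eq. (19): update
    return W

# Example: train on random stand-in feature vectors of dimension 30.
W = train_ksom(rng.normal(size=(200, 30)))
print(W.shape)                                           # (15, 10, 30)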
3.2.2 Detecting Representative Vectors for
Personal User Context
Based on the information about the number of times each personal user context appears, we make a probability map by calculating the personal user context appearance probability in each cell. We define the personal user context appearance probability $P(C_{n_c}|X_{n_c})$ as follows:
$P(C_{n_c}|X_{n_c}) = \dfrac{P(X_{n_c}|C_{n_c})\, P(C_{n_c})}{\sum_{k_c} P(X_{n_c}|C_{k_c})\, P(C_{k_c})}$   (20)

We also define the probabilities $P(X_{n_c}|C_{n_c})$ and $P(C_{n_c})$:

$P(X_{n_c}|C_{n_c}) = \dfrac{N(X_{n_c}(C_{n_c}))}{\sum_{k_c} N(X_{k_c}(C_{k_c}))}$   (21)
$P(C_{n_c}) = \dfrac{N(C_{n_c})}{\sum_{k_c} N(C_{k_c})}$   (22)
where $C_{n_c}$ and $X_{n_c}$ represent the appearance of personal user context $n_c$ and the cell on the feature map selected as nearest to the input vector $X$, respectively. $P(X_{n_c}|C_{n_c})$ and $P(C_{n_c})$ are the conditional probability of $X_{n_c}$ given $C_{n_c}$ and the prior probability of $C_{n_c}$, respectively. $N(X_{k_c}(C_{k_c}))$ represents the number of occurrences of $X_{k_c}$ labeled with $C_{k_c}$.
We use the probability map of personal user context appearance for two purposes: to find representative feature vectors for each personal user context and to output the possibility of the existence of each personal user context given the three-axis sensor data. To realize the former, we take a feature vector as representative when its probability satisfies the condition below:

$P(C_{n_c}|X_{n_c}) > Th_p$   (23)

where $Th_p$ is a threshold used to judge whether $X_{n_c}$ is representative of $C_{n_c}$. We can then obtain the most representative sensor data corresponding to the $X_{n_c}$ found in this way.
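The probability map of Eqs. (20)-(22) reduces to simple counting once each training frame has been assigned to its best-matching cell; the sketch below computes the posterior P(C | cell) from a count matrix and keeps cells whose posterior exceeds Th_p as representatives. The count layout and threshold value are illustrative assumptions.

# Sketch: posterior context probabilities per cell, Eqs. (20)-(22).
import numpy as np

def context_posteriors(counts):
    """counts[c, k]: number of training frames of context k mapped to cell c."""
    counts = np.asarray(counts, dtype=float)
    likelihood = counts / counts.sum(axis=0, keepdims=True)   # P(cell | C), Eq. (21)
    prior = counts.sum(axis=0) / counts.sum()                 # P(C),        Eq. (22)
    joint = likelihood * prior
    return joint / joint.sum(axis=1, keepdims=True)           # P(C | cell), Eq. (20)

counts = np.array([[12, 0, 3],    # cell 0: WN=12, WF=0, US=3 (dummy counts)
                   [ 2, 9, 0],    # cell 1
                   [ 0, 1, 7]])   # cell 2
post = context_posteriors(counts)
representative = post.max(axis=1) > 0.7                       # Th_p = 0.7 (illustrative)
print(np.round(post, 2), representative)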
3.2.3 Training by using Pseudo Sensor Data
In order to achieve robustness with regard to the
positions of the sensor, we have already proposed
a method based on non-linear simultaneous equa-
tions. Unfortunately, its computational cost is exces-
sive and it often fails to provide the real-time process-
ing needed by ubiquitous services. Therefore, the
feature map includes the possibilities generated from
not only observed training data but also pseudo-data,
which are generated from the best representative sen-
sor data by rotational transformation. After obtaining
the best representative sensor data (see section 3.2.2),
we transform them using the rotation matrix as de-
scribed below;
$a_p = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_n & -\sin\alpha_n \\ 0 & \sin\alpha_n & \cos\alpha_n \end{pmatrix} \begin{pmatrix} \cos\beta_n & 0 & \sin\beta_n \\ 0 & 1 & 0 \\ -\sin\beta_n & 0 & \cos\beta_n \end{pmatrix} \begin{pmatrix} \cos\gamma_n & -\sin\gamma_n & 0 \\ \sin\gamma_n & \cos\gamma_n & 0 \\ 0 & 0 & 1 \end{pmatrix} a_r$   (24)

where $\alpha_n$, $\beta_n$ and $\gamma_n$ are the rotation angles around the x, y and z-axis, respectively.
Table 2: Probability map example.

Feature Vector | WN   | WF   | ... | US   (probabilities [%])
X_1            | 82.3 | 0.0  | ... | 17.7
X_2            | 73.7 | 25.6 | ... | 0.7
...            | ...  | ...  | ... | ...
X_k            | 28.6 | 0.0  | ... | 71.4
By increasing these angles in steps of $\theta$, we can obtain a variety of pseudo sensor data $a_p$. Training with the pseudo-data for each angle set $(\alpha_n, \beta_n, \gamma_n)$, following the approach given in Sections 3.2.1 and 3.2.2, yields as many feature maps as there are angle sets.
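As an illustration of the pseudo-data generation of Eq. (24), the sketch below rotates a representative three-axis frame by every combination of angles in 15[deg] steps from 0[deg] to 45[deg], matching the settings of Section 4; the function names and frame layout are illustrative assumptions.

# Sketch: pseudo sensor data by rotational transformation (Eq. (24)).
import itertools
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    """R_x(alpha) @ R_y(beta) @ R_z(gamma), angles in radians."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return rx @ ry @ rz

def pseudo_frames(frame, step_deg=15, max_deg=45):
    """frame: array of shape (frame_len, 3); one rotated copy per angle set."""
    angles = np.deg2rad(np.arange(0, max_deg + 1, step_deg))
    return {(a, b, g): frame @ rotation_matrix(a, b, g).T
            for a, b, g in itertools.product(angles, repeat=3)}

frame = np.random.randn(300, 3)                  # a representative 3 s frame
print(len(pseudo_frames(frame)))                 # 4**3 = 64 angle sets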
3.3 Personal User Context
Extraction Stage
In order to recognize personal user context from
three-axis sensor data, we use all feature maps with
the possibilities trained by both observed and pseudo
sensor data. The specific pattern detection stage pro-
ceeds as follows;
1. Extract three-axis sensor data
2. Convert them into frame data and calibrate them in
terms of scale
3. Decompose them using the best basis approach by
wavelet packet decomposition and calculate feature
vectors from their spectrogram
4. Calculate residual errors between them and the nearest cells in each feature map
5. Find the most suitable cell $X_{bm}$ under the conditions below:

$X_{bm} = \arg\min_{X_c} \{\|X - X_c\|\}$
$\|X_{bm} - X_c\| < Th_{bm}$
$Corr(M_{bm}, M_c) > Th_{corr}$   (25)
where $M_{bm}$ and $M_c$ are the feature vectors of the information entropy distributions of $X_{bm}$ and $X_c$, respectively. $Corr(M_{bm}, M_c)$ denotes the normalized cross-correlation between $M_{bm}$ and $M_c$. $Th_{bm}$ and $Th_{corr}$ are thresholds used to judge feature vector similarity (a minimal sketch of this matching step is given below).
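A minimal sketch of this matching step, with illustrative data layout and threshold values rather than the tuned values of the prototype, is given below: the nearest cell is chosen, and the match is accepted only if the distance is small enough and the entropy-statistics vectors correlate strongly.

# Sketch of the acceptance test of Eq. (25); thresholds are illustrative.
import numpy as np

def match_cell(xs, m, cell_xs, cell_m, th_bm=1.0, th_corr=0.8):
    """xs, m: frame features; cell_xs, cell_m: per-cell feature arrays."""
    dists = np.linalg.norm(cell_xs - xs, axis=1)
    bm = int(np.argmin(dists))                   # best-matching cell
    corr = np.corrcoef(m, cell_m[bm])[0, 1]      # normalized cross-correlation
    if dists[bm] < th_bm and corr > th_corr:
        return bm                                # accept: cell index to look up in the map
    return None                                  # reject: no reliable context

# Example with random stand-in data: 150 cells, Xs of dim 30, M of dim 4.
rng = np.random.default_rng(1)
cell_xs, cell_m = rng.normal(size=(150, 30)), rng.normal(size=(150, 4))
print(match_cell(cell_xs[7] + 0.01, cell_m[7], cell_xs, cell_m))  # 7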
An example of the probability map described in the previous section is given in Table 2. Retrieving the most suitable cell $X_{bm}$ from Table 2 yields the personal user context.
Figure 4: PerContEx system. (A cell phone carrying multiple sensors detects and transmits sensory data over the mobile network to a user context management server, which analyzes the data, extracts the user context, stores it in a user context database, and converts it into presence information for delivery; physical user context is thus extracted without the user wearing sensors on the body.)
Figure 5: A three-axis accelerometer mounted on a cell phone.
4 EXPERIMENTS
We validate our proposed method using a prototype system. The experiments targeted four user contexts
associated with movement: walking, going up/down
stairs, walking rapidly, and running. These targets are
thought to be relatively hard to discriminate as well as
being extremely useful in health advice services such
as calculating a person’s calorie consumption.
4.1 Experimental Conditions
Our system is composed of a cell phone with a three-axis sensor and a PerContEx server. The cell phone transmits the sensor data to the PerContEx server. The PerContEx server extracts the user's context from the data and sends the results of context extraction to the cell phone or to another personal computer (see Figure 4). The experimental conditions of the system are as follows:
System specifications
Cell phone: IMT-2000 Terminal
(NTT DoCoMo P900i, see Figure 5)
Sensor: three-axis accelerometer
(Hitachi: H48C)
PerContEx server: Personal Computer
(DELL: Dimension9150)
OS: Windows XP Pro.
CPU: Pentium D 830, 3[GHz]
MEM: 2[GB]
Parameters
Mother wavelet: Haar function
Scale level: 5
KSOM: 15[cell] × 10[cell]
Data specifications
Personal user context: five kinds of gait pattern
walking at normal speed (WN),
going up stairs (US)
going down stairs (DS)
walking fast (WF)
running (RN)
Carrying style: placed in user's breast pocket
and hip pocket
Real data: data sampling: 100[Hz]
training data 30[min]
test data 50[min]
Pseudo-data:
step of rotation angle: 15[deg]
range of rotation angle (around the x, y, and z-axis, respectively): from 0[deg] to 45[deg]
frame length: 3[sec] (overlap time: 1.5[sec])
4.2 Results and Estimation
Figure 6 shows the results of identifying ”walking at
normal speed” and ”going up stairs”. As both move-
ments contain the behavior of ”walking”, we can see
similar wave patterns at the best basis level 2 and 3.
Thus, the other best bases are useful for discriminat-
ing between ”walking at normal speed” and ”going
up stairs” and noise. After feature extraction, we
made a feature map with personal user context ap-
pearance probabilities by using real data and pseudo
sensor data as described above. Figure 7 shows the
resulting feature map. We can see that each personal
user context has its own feature map. Table 3 Lists the
accuracy of personal user context extraction. In the
experiment, we identified the context which has the
maximum probabilities in probabilities of the other
contexts.
According to Table 3, our proposed algorithm can extract and discriminate the above gaits as personal user context independent of the cell phone position.
Figure 6: Sensor data decomposed by wavelet packet. (a) Walking at normal speed. (b) Going up stairs. (Similar waves appear at best bases 2.1 and 3.1.)
Table 3: Accuracy of personal user context extraction.

Cell Phone Position | WN   | US   | DS   | WF   | RN   (accuracy [%])
Breast pocket       | 83.2 | 79.3 | 76.6 | 88.4 | 94.2
Hip pocket          | 85.1 | 88.5 | 72.1 | 94.9 | 92.0
Average             | 84.2 | 84.1 | 74.4 | 91.6 | 93.1
Other tests showed that the proposed algorithm can extract not only walking (WN) and running (RN) but also going up/down stairs (US/DS) and walking fast (WF). The most difficult gait to identify was DS ("going down stairs"). We think that DS places weaker constraints on body movement than the other gaits such as US ("going up stairs") and RN ("running"). Many DS events were misidentified as "walking at normal speed" or "walking fast". We need to enhance the algorithm to provide a more detailed analysis of the sensor data.
5 CONCLUSION
We proposed an algorithm to extract personal user context; it uses feature extraction based on wavelet packet decomposition and a KSOM augmented with context appearance probabilities. Experiments on a prototype system showed that, when the cell phone with the sensor was simply carried in the user's pockets rather than fixed to the body, the algorithm could extract personal user context with an accuracy of about 88[%]. Moreover, we showed that, with only a single three-axis accelerometer, it could extract personal user contexts such as walking, running, walking fast, and going up/down stairs with an accuracy of about 80[%].
Figure 7: Learnt feature map with probability information (panels for walking at normal speed, going up stairs, going down stairs, walking fast, and running).
The results show that the algorithm not only reduces the constraints placed on users but can also be applied to ubiquitous services that require personal user context.
Future work is to realize a more comprehensive analysis of the sensor data in order to identify personal user context in more complicated situations, such as when the cell phone is carried by hand. We will then apply the algorithm to ubiquitous services such as presence and health care services.
ACKNOWLEDGEMENTS
The authors would like to thank Dr. K. Imai, NTT
DoCoMo Network Labs for his encouragement. We
also thank Mr. M. Kinoshita, NTT-IT Corporation and
Mr. H. Miyoshi, Seiko Epson Corporation for their
technical support in the experiments.
REFERENCES
Clarkson, B., Mase, K., and Pentland, A. (2000). Recognizing user's context from wearable sensors: Baseline system. Technical Report 519.
DeVaul, R. W. and Dunn, S. Real-Time Motion
Classification for Wearable Computing Applications.
http://www.media.mit.edu/wearables/mithril/
realtime.pdf.
Healey, J. and Logan, B. Wearable Wellness Mon-
itoring Using ECG and Accelerometer Data.
http://www.hpl.hp.com/techreports/2005/HPL-2005-
134.pdf.
Ishikawa, Y. (2000). Wavelet Analysis for Clinical Medi-
cine. IGAKU-SHUPPAN.
Iso, T., Kawasaki, N., and Kurakake, S. (2005). Per-
sonal context extractor with multiple sensors on a cell
phone. In 7th IFIP International Conference on Mo-
bile and Wireless Communications Networks, 2005.
K. V. Laerhoven, A. Schmidt, H. W. G. (2003). Limitations
of multi-sensor context-aware clothing. 7(3).
Kern, N., Antifakos, S., Schiele, B., and Schwaninger,
A. (2004). A model for human interruptability: ex-
perimental evaluation and automatic estimation from
wearable sensors. In 8th IEEE International Sympo-
sium on Wearable Computers.
Kohonen, T. SOM Toolbox 2.0. The Laboratory of Information and Computer Science, Helsinki University of Technology, http://www.cis.hut.fi/projects/somtoolbox/.
Krause, A., Siewiorek, D. P., Smailagic, A., and Farring-
don, J. (2003). Unsupervised, dynamic identifica-
tion of physiological and activity context in wearable
computing. In 7th IEEE International Symposium on
Wearable Computers, pages 8–17.
M. Unuma, K. Kurata, A. T. T. H. (2004). Autonomy posi-
tion detection by using recognition of human walking
motion. J87-A(1):78–86.
Miao, T., Shimizu, T., Miyake, S., Hashimoto, M., and Shi-
moyama, O. (2003). Alteractions of complexity mea-
sures during subsidiary task in multi-attribute opera-
tions. In Human Interface Symposium, pages 791–
792.
NTT DoCoMo, Inc. http://www.nttdocomo.com/home.html.
Randell, C. and Muller, H. (2000). Context awareness
by analysing accelerometer data. Technical Report
CSTR-00-009.
Si, H., Kawahara, Y., Kurasawa, H., Morikawa, H., and
Aoyama, T. (2005). A context-aware collaborative fil-
tering algorithm for real world oriented content deliv-
ery service. In Ubicom2005 Metapolis and Urban Life
Workshop.
Siewiorek, D., Smailagic, A., Furukawa, J., Krause, A.,
Moraveji, N., Reiger, K., Shaffer, J., and Wong, F. L.
(2003). Sensay: A context-aware mobile phone. In 7th
IEEE International Symposium on Wearable Comput-
ers, pages 248–257.