3 KEY IDEAS
Some past work considers only a single sensor for authentication (Buthpitiya et al., 2011; Trojahn and Ortmeier, 2013; Li et al., 2013; Nickel et al., 2012).
We will show that the authentication accuracy can be
improved by taking other sensors into consideration.
We propose a multi-sensor-based technology with a
machine learning method for implicit authentication,
which not only takes a short time to detect an abnormal
user, but also needs less than 10 seconds to retrain
the user’s profile every day. First, we collect the data
from the selected sensors. Then, we use the SVM
technique as the classification algorithm to differenti-
ate the usage patterns of various users and authenti-
cate the user of the smartphone.
Our methodology can be extended to other sensors in a straightforward manner. Figure 1 shows our
methodology, and the key ideas are presented below.
3.1 Sensor Selection
Many sensors are built into smartphones nowadays, as shown in Table 1 and Table 2. As smartphones become more intertwined with our daily lives, these sensors capture a great deal of personal information. The goal is to choose a small set of sensors that can accurately represent a user's characteristics. In this paper, we experiment with three sensors
that are commonly found in smartphones: accelerom-
eters, orientation sensors and magnetometers. They
also represent different information about the user’s
behavior and environment: the accelerometer can de-
tect coarse-grained motion of a user like how he walks
(Nickel et al., 2012), the orientation sensor can de-
tect fine-grained motion of a user like how he holds a
smartphone (Xu et al., 2012), and the magnetometer
measurements can perhaps be useful in representing
his environment. Furthermore, these sensors do not
need the user’s permission to be used in Android ap-
plications, which is useful for continuous monitoring
for implicit authentication.
Also, these three sensors do not require the user to perform a sequence of scripted actions, which facilitates implicit authentication. Note that
our method is not limited to these three sensors, but
can be easily generalized to different selections of
hard or soft sensors, or to incorporate more sensors.
3.2 Data Sets and Re-sampling
We use two data sets: one that we collected ourselves, which we call the PU data set, and another obtained from the authors of a published paper (Kayacık et al., 2014), which we call the GCU data set.
The PU data set was collected from 4 graduate students at Princeton University in 2014, using a Google Nexus 5 smartphone running Android 4.4. It
contains sensor data from the accelerometer, orien-
tation sensor and magnetometer with a sampling rate
of 5 Hz. The duration of the data collected is approximately 5 days for each user. Each sensor measurement consists of three values (one per axis), so we construct a vector from these nine values. We vary the sampling rate as an experimental factor when constructing the data points.
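The construction of the nine-value vector described above can be sketched as follows (a minimal numpy example; the array names and example readings are ours, not from the data set):

```python
import numpy as np

# Hypothetical per-timestep readings: each sensor reports (x, y, z).
accel  = np.array([0.12, 9.78, 0.31])   # accelerometer (m/s^2)
orient = np.array([201.5, -3.2, 1.1])   # orientation (azimuth, pitch, roll)
magnet = np.array([22.4, -8.9, 40.2])   # magnetometer (microtesla)

# Concatenate the three 3-axis measurements into one 9-dimensional
# feature vector, one vector per sampling instant.
feature = np.concatenate([accel, orient, magnet])
assert feature.shape == (9,)
```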
We use the second data set, called the GCU data set version 2 (Kayacık et al., 2014), for comparison. This
is collected from 4 users consisting of staff and stu-
dents of Glasgow Caledonian University. The data
was collected in 2014 from Android devices and con-
tains sensor data from wifi networks, cell towers, ap-
plication use, light and sound levels, acceleration, ro-
tation, magnetic field and device system statistics.
The duration of the data collected is approximately
3 weeks. For better comparison with our PU data set, we use only the data collected from the accelerometer, orientation sensor and magnetometer.
The raw sensor measurements are too voluminous to process directly. Hence, we use a re-sampling process that not only reduces the computational complexity but also reduces the effect of noise by averaging the data points. For example, if we want
to reduce the data set by a factor of 5, we average every 5 contiguous data points into one. In Section 4,
we will show that the time for training a user’s profile
can be significantly reduced by re-sampling.
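The re-sampling step above can be sketched as follows (a minimal numpy implementation under our own naming; the function is illustrative, not the authors' code):

```python
import numpy as np

def resample(data, factor):
    """Reduce an (n_samples, n_features) array by averaging each block
    of `factor` contiguous data points into a single data point."""
    n = (len(data) // factor) * factor        # drop any trailing remainder
    return data[:n].reshape(-1, factor, data.shape[1]).mean(axis=1)

# 100 nine-dimensional sensor vectors -> 20 averaged vectors (factor 5).
raw = np.arange(900, dtype=float).reshape(100, 9)
reduced = resample(raw, 5)
assert reduced.shape == (20, 9)
```

Averaging blocks of contiguous points acts as a simple low-pass filter, which is why it reduces noise as well as data volume.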
3.3 Support Vector Machines
The classification method used by prior work did not
give very accurate results. Hence, we propose the use
of the SVM technique for better authentication accu-
racy.
Support Vector Machines (SVMs) are state-of-
the-art large margin classifiers, which represent a
class of supervised machine learning algorithms first introduced by Vapnik (Vapnik and Vapnik, 1998). SVMs
have recently gained popularity for human activity
recognition on smartphones (Anguita et al., 2012).
In this section, we provide a brief review of the re-
lated theory of SVMs (Cristianini and Shawe-Taylor,
2000), (Vapnik and Vapnik, 1998).
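As an illustration of this classification setting (not the authors' exact configuration), a binary SVM separating the legitimate user's re-sampled nine-dimensional sensor vectors from other users' vectors can be trained with scikit-learn; the synthetic clusters below stand in for real sensor data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-ins for re-sampled 9-dimensional sensor vectors:
# the "owner" class clusters around one point, "others" around another.
owner  = rng.normal(loc=0.0, scale=0.5, size=(200, 9))
others = rng.normal(loc=2.0, scale=0.5, size=(200, 9))

X = np.vstack([owner, others])
y = np.array([1] * 200 + [0] * 200)   # 1 = legitimate user, 0 = abnormal

clf = SVC(kernel="rbf")               # RBF kernel, a common default choice
clf.fit(X, y)

# A new measurement near the owner's cluster is accepted,
# one near the other cluster is flagged as abnormal.
print(clf.predict(np.zeros((1, 9))))       # -> [1]
print(clf.predict(np.full((1, 9), 2.0)))   # -> [0]
```

Retraining on a day's worth of re-sampled vectors is cheap at this scale, consistent with the short retraining time reported above.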
After obtaining the features from sensors, we use
ICISSP 2015 - 1st International Conference on Information Systems Security and Privacy