our experiments. Section 6 describes the software im-
plementation of our proposed work by introducing a
web-browser extension. Finally, Section 7 presents
the conclusions of our paper and includes a discus-
sion of possible directions for future work.
2 RELATED WORK
Several works have been published on the long-standing
problem of intrusion detection. For example, the
work in (Saber, 2019) describes a machine learning-
based approach to detecting vehicle theft by analyzing
the anomalies in the driving behavior of the user. An-
other example is the work in (Manikoth et al., 2018),
where the authors implemented a detection mechanism
based on several classifiers with the goal of finding
the best subset of features to identify unauthorized
use of a mobile device. In (Huang et al., 2021), the
authors propose a new dataset for authenticating users
to their own mobile device. This last work relies on
gesture-based authentication, where sensor information
is used to train machine learning models.
As seen in these examples, machine learning-based
user authentication can be approached in several
different ways. We now describe several such
approaches proposed in academia over the past few
years. In particular, we concentrate on browser
intrusion and insider threat attacks. Three works
stand out due to their promising results and
innovative solutions in the field of intrusion
detection. One approach was to periodically intervene
while users are using the computer (Chen et al.,
2014). The approach proved successful in
terms of achieving high accuracy in identifying a
legitimate user. For five seconds of verification, it
achieved a 2.86% False Rejection Rate (FRR) and a 4.00% False
Acceptance Rate (FAR). Here, FAR is the percentage
of identifications in which unauthorized
individuals are incorrectly accepted (also called fraud
rate), while FRR is the percentage of identifications in
which authorized individuals are incorrectly rejected
(also called insult rate). Another method was to ex-
tract features from mouse operations (Jorgensen and
Yu, 2011). This approach was also fairly successful,
achieving roughly 2% for both FRR and FAR. However,
the average time to authenticate a user was relatively
long, ranging from several minutes to sometimes
more than 10 minutes. Although the method can detect a
malicious user, within that time frame the malicious
user can still do a considerable amount of damage
to the victim’s data. Hence, the long time needed
to authenticate users makes this approach
infeasible in the real world. The work proposed
in this paper takes inspiration from a study
that proposed using a CNN to authenticate
users through mouse dynamics (Hu et al., 2019).
In this approach, the authors converted mouse operations
into JPEG images following certain rules. They
used an open-source dataset for mouse dynamics.
With that, they were able to achieve 2.96% FAR
and 2.27% FRR within only seven seconds. The great
advantage of this approach is that it does not require
explicit feature extraction, and so does not lose any
information from the user actions. Furthermore, it does not
require any other algorithms to extract features from
the dataset. For this reason, we decided to build
further on the direction taken by this
work. In particular, we decided to apply Natural
Language Processing (NLP) algorithms to the extracted
user data.
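As a minimal sketch of how the FAR and FRR values quoted in this section can be obtained, the following Python snippet computes both rates from a list of verification attempts. The function name far_frr and the boolean encoding of attempts are illustrative assumptions and are not taken from any of the cited works. We also assume the conventional denominators: impostor attempts for FAR and genuine attempts for FRR.

```python
def far_frr(is_genuine, accepted):
    """Return (FAR, FRR) as percentages.

    is_genuine[i] is True when attempt i comes from the legitimate user.
    accepted[i] is True when the system accepted attempt i.
    Both names are hypothetical and used only for illustration.
    """
    if len(is_genuine) != len(accepted):
        raise ValueError("both lists must have the same length")

    # Split the system decisions by the true origin of each attempt.
    impostor_decisions = [a for g, a in zip(is_genuine, accepted) if not g]
    genuine_decisions = [a for g, a in zip(is_genuine, accepted) if g]
    if not impostor_decisions or not genuine_decisions:
        raise ValueError("need at least one genuine and one impostor attempt")

    # FAR: share of impostor attempts that were wrongly accepted.
    far = 100.0 * sum(impostor_decisions) / len(impostor_decisions)
    # FRR: share of genuine attempts that were wrongly rejected.
    frr = 100.0 * sum(not a for a in genuine_decisions) / len(genuine_decisions)
    return far, frr


# Toy data: 4 genuine attempts (1 rejected), 5 impostor attempts (1 accepted).
is_genuine = [True, True, True, True, False, False, False, False, False]
accepted = [True, True, True, False, True, False, False, False, False]
far, frr = far_frr(is_genuine, accepted)
print(f"FAR = {far:.2f}%, FRR = {frr:.2f}%")
```

On this toy data, one of five impostor attempts is accepted and one of four genuine attempts is rejected, so the sketch reports a FAR of 20% and an FRR of 25%.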
3 BACKGROUND
In this section, we describe the different types of machine
learning algorithms that we applied to mouse dynamics
user authentication in our experiments.
3.1 CNN
A Convolutional Neural Network (CNN) (O’Shea and Nash,
2015) is a type of neural network
that is popular for analyzing images and detecting
patterns. One characteristic of the network is that
it has multiple convolutional layers, and those layers
are responsible for finding patterns in input images.
One can think of an input image as a
mathematical matrix. A CNN internally applies filters
to groups of cells in the matrix, and this results in
pattern recognition. There are many types of filters
for recognizing different patterns, for instance simple
shapes such as squares or edges. By combining those
filters, more sophisticated shapes can be detected, such
as a dog, a cat, or a human face. A representation
of the patterns recognized by a CNN is shown in Fig-
ure 1 for different classified objects.
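To make the idea of applying a filter to groups of cells concrete, the following Python sketch slides a 3x3 filter over a small image matrix. The image, the filter values, and the function name apply_filter are illustrative assumptions. In a real CNN the filter values are learned during training rather than set by hand, and the operation is usually carried out by an optimized library.

```python
def apply_filter(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    At each position, the overlapping cells are multiplied element-wise
    and summed. This is the cross-correlation form used by most CNN
    libraries under the name "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A 4x5 image that is dark (0) on the left and bright (1) on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# A hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = apply_filter(image, vertical_edge)
for row in response:
    print(row)
```

The response is zero over the uniform dark region and positive wherever the filter window covers the transition from dark to bright, which is how a single filter signals the presence of an edge.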
3.2 LSTM
LSTM stands for long short-term memory, and it is
a type of recurrent neural network (RNN).
Traditionally, RNNs suffered from a short-term memory
problem, where important data could not be propagated to
the final layers during prediction. LSTM aims to
reduce this problem. RNNs usually have multiple short-term
memory cells. In the case of LSTM, we have
a memory cell which includes scope for long term