MASD: Malicious Web Session Detection Using ML-Based Classifier
Dilek Yılmazer Demirel
a
and Mehmet Tahir Sandıkkaya
b
Department of Computer Engineering, Istanbul Technical University, Istanbul, Turkey
Keywords:
Malicious Web Session Detection, Machine Learning, Classification.
Abstract:
The development of web applications and services has resulted in an increase in security concerns, especially in
identifying malicious web session attacks. Malicious web sessions pose a significant risk to users, potentially
resulting in data breaches, illegal access, and other malicious activities. This study presents an innovative tech-
nique for detecting malicious web sessions using a machine learning-driven classifier. To examine the features
of web sessions, the suggested technique combines an embedding layer and machine learning approaches.
Three different datasets were used in the empirical studies to confirm the effectiveness of the approach. They
include a unique compilation of Internet banking web request logs, provided by Yap Kredi Teknoloji, as well
as the well-known HTTP dataset CSIC 2010 and the publicly accessible WAF dataset. The experimental
results are compared to known approaches such as Random Forest, Convolutional Neural Networks (CNN),
Support Vector Machines (SVM), Na
¨
ıve Bayes, Decision Trees, DBSCAN, and Self-Organizing Maps (SOM).
The actual findings demonstrate the superiority of the suggested technique, especially when Random Forest is
used as the chosen classifier. The attained accuracy rate of 99.17% surpasses the comparison methodologies,
highlighting the approach’s ability to efficiently identify and block malicious web sessions.
1 INTRODUCTION
The digital interaction, communication, and com-
merce landscape has been profoundly reshaped by
the widespread integration of web-based applications.
As users increasingly navigate these virtual environ-
ments, their activities have become pivotal to mod-
ern life. However, this rapid technological evolution
has not only heralded unprecedented convenience but
also ushered in an era of heightened vulnerability. In
tandem with this progress, malicious actors have in-
geniously exploited the expanding digital infrastruc-
ture, thereby imperiling the security and privacy of
web users. In particular, attacks intended for web
sessions have emerged as persistent threats, thereby
necessitating the development of potent and stream-
lined strategies for their identification and prevention
(Mansfield-Devine, 2022; Toprak and Yavuz, 2022).
Cybersecurity metrics underscore the gravity of the
situation, revealing that a substantial 68% of web ap-
plications are susceptible to breaches involving sen-
sitive data (PurpleSec, 2022). Equally concerning is
the report indicating that web applications serve as the
conduit for 56% of all cyberattacks, 95% of which
a
https://orcid.org/0000-0002-4008-4478
b
https://orcid.org/0000-0002-9756-603X
are driven by financial motives (Mansfield-Devine,
2022). In the face of these mounting challenges,
conventional rule-based attack prevention systems are
found to be incapable of adapting to swiftly evolving
attack vectors. As a result, increasing number of re-
searchers have turned their attention to the realm of
Machine Learning (ML)-based solutions as a promis-
ing avenue for the detection of malicious HTTP ses-
sions, in pursuit of a more agile and effective defense
paradigm.
Clustering algorithms have garnered significant
attention among researchers seeking effective solu-
tions for the detection of malicious web sessions. In
various studies, clustering algorithms such as DB-
SCAN, SOM, and Modified ART2 have been em-
ployed to distinguish between legitimate and mali-
cious sessions, often coupled with a feature extrac-
tion phase (Sun et al., 2020; Stevanovic et al., 2011;
Sadeghpour et al., 2021). During the feature extrac-
tion phase, these investigations extract a constrained
set of features from web sessions, which subsequently
serve as input for the clustering algorithms. This
method facilitates session-based clustering to discern
and isolate malicious sessions. Complementary to
these unsupervised methodologies, several supervised
approaches have also emerged. These endeavors not
Yılmazer Demirel, D. and Sandıkkaya, M.
MASD: Malicious Web Session Detection Using ML-Based Classifier.
DOI: 10.5220/0012174800003595
In Proceedings of the 15th International Joint Conference on Computational Intelligence (IJCCI 2023), pages 487-495
ISBN: 978-989-758-674-3; ISSN: 2184-3236
Copyright © 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
487
only aim to classify web sessions into malicious and
non-malicious categories but also employ feature ex-
traction techniques akin to their unsupervised coun-
terparts (Goseva-Popstojanova et al., 2012). In the
context of this discourse, the present paper intro-
duces an innovative approach to the detection of ma-
licious web sessions utilizing a Machine Learning
(ML)-based classifier. In a departure from conven-
tional techniques, this approach eschews the feature
extraction phase that typically involves a limited fea-
ture set. In this manner, this novel approach achieves
heightened accuracy and a reduced false-positive rate,
thereby fortifying its efficacy in the domain of finan-
cial security. Major contributions are summarized in
the list below:
Introducing and implementing a new approach to
detect malicious web sessions using the Random
Forest as a classifier with the embedding layer, as
described in Section 4.
Implementing benchmark models such as SVM,
Na
¨
ıve Bayes, CNN, Decision Tree, DBSCAN,
and SOM, as given in Section 6.
Using proposed models and benchmarks on three
different datasets (a unique financial dataset, the
CSIC 2010 HTTP dataset (Gim
´
enez et al., 2012),
and the WAF dataset (Ahmad, 2017)) to compare
performance results after the training and the test
steps of the model, as shown in Section 7.
The subsequent sections of this paper are struc-
tured as follows: Section 2 provides an overview of
related research, outlining current methodologies and
outcomes in the realm of malicious web session de-
tection. In Section 3, an exposition is provided re-
garding the datasets employed, accompanied by a de-
tailed elucidation of how the samples within these
datasets are categorized. Section 7 offers a presenta-
tion of experimental results and comparative analyses,
shedding light on the performance of the proposed ap-
proach in contrast to existing methods. Finally, the
conclusions and implications drawn from the findings
of this study are described in Section 7.2.
2 RELATED WORK
Malicious web session detection has attracted the at-
tention of several studies. Both supervised and unsu-
pervised approaches are experimented to detect ma-
licious sessions. In unsupervised approaches, dif-
ferent clustering algorithms together with a feature
extraction method are preferred. For instance, Sun
et al. proposes an unsupervised learning technique
called WSAD to detect anomalies utilizing web ses-
sion data. The authors realize the limits of standard
approaches that rely on labeled data and present an
alternative. The WSAD technique involves a feature
extraction mechanism that converts raw web session
data into meaningful representations and a density-
based clustering algorithm that groups comparable
sessions. In addition, a session-based outlier iden-
tification approach is used to identify sessions that
differ considerably from existing clusters (Sun et al.,
2020). Another study presents a comprehensive ap-
proach for identifying abnormalities in web session
data, particularly in dynamic environments. The au-
thors discuss the issue of recognizing anomalous be-
haviors that may occur as a result of the ever-changing
nature of web applications. The suggested method
incorporates both web usage mining and clustering
techniques to efficiently capture abnormal patterns.
It adopts a feature extraction approach that includes
multiple session-based and sequence-based elements
to describe web sessions thoroughly (Guti
´
errez et al.,
2010). Stenovic et al. suggests a method for unsu-
pervised clustering of web sessions to distinguish be-
tween malicious and non-malicious website sessions.
This paper addresses the challenge of identifying po-
tential threats and abnormal behaviors without rely-
ing on labeled data or prior knowledge of attack pat-
terns. The proposed approach combines feature en-
gineering techniques with a clustering algorithm to
group similar web sessions together. The method
aims to differentiate between legitimate user behav-
ior and malicious activities by capturing the inher-
ent patterns and similarities within the session data
(Stevanovic et al., 2011). Additionally, Sadeghpour
et al. present another unsupervised approach using
the SOM algorithm as a clustering method with an
automated feature selection method utilizing gradient
boosting (Sadeghpour et al., 2021).
On the other hand, several researchers focus on
supervised techniques over web session data. For in-
stance, Goseva et al. struggle to identify and distin-
guish between legitimate and malicious web sessions.
The suggested method utilizes machine learning tech-
niques and feature engineering to extract meaningful
information from web session data. The researchers
employ a variety of features such as session duration,
request types, and frequency to represent each ses-
sion. These features are then used to train a clas-
sification model capable of accurately categorizing
web sessions as malicious or non-malicious (Goseva-
Popstojanova et al., 2012). In addition to this work,
other approaches exist to determine whether a web
session was created by a legitimate user. Suchacka
et al. focus on separating sessions by creators, ei-
NCTA 2023 - 15th International Conference on Neural Computation Theory and Applications
488
Table 1: Comparative Analysis: Proposed Approach vs. Related Work.
Research Date SL UL Approach Dataset
Sun et al. (Sun et al., 2020) 2020 X Clustering (DBSCAN) Two Different Web Sites Data
Guti
´
errez et al. (Guti
´
errez
et al., 2010)
2010 X X Web Usage Mining, Cluster-
ing, and Graph Mining
Synthetic Data
Stevanovic et al. (Stevanovic
et al., 2011)
2011 X Clustering (SOM and Modi-
fied ART2)
Dataset obtained using Web-
Crawlers
Goseva et al. (Goseva-
Popstojanova et al., 2012)
2012 X Decision Tree (J48 and
PART)
WebDBAdmin and Web 2.0 Datasets
collected using these honeypots
Sadeghpour et al. (Sadeghpour
et al., 2021)
2021 X Clustering (SOM) Dataset provided by the York Univer-
sity’s EECS
Suchacka et al. (Suchacka and
Iwa
´
nski, 2020)
2020 X Clustering (aIB Algorithm) Dataset obtained by real e-commerce
website
Proposed Approach 2023 X Random Forest Dataset obtained by real banking
website, CSIC 2010, and WAF
Dataset
ther users or bots, utilizing an agglomerative Infor-
mation Bottleneck (aIB) algorithm that aims to ex-
tract the most informative features from the traffic
data while minimizing redundancy (Suchacka and
Iwa
´
nski, 2020).
The comparison of the suggested approach and
other studies is depicted in Table 1. This paper intro-
duces a series of distinctive advancements and contri-
butions that set it apart from existing studies on mali-
cious session detection.
To begin with, the paper proposes a novel ap-
proach that specifically targets the detection of
malicious sessions. This innovative method ad-
dresses the intricate task of extracting discrimi-
native information from both malicious and be-
nign sessions through the utilization of web re-
quests, surmounting this challenge through the
implementation of an embedding layer.
Moreover, this study pioneers the application of
the proposed approach to analyze web session
data sourced from the banking industry, an arena
that has been relatively underexplored in the con-
text of malicious session detection. The efficacy
of the introduced approach is rigorously assessed
using a diverse set of benchmark datasets, effec-
tively showcasing its substantial practical poten-
tial.
3 DATASETS
Every interaction directed at a web server leaves a
trace in the form of chronological web request logs.
These logs play a pivotal role later in identifying
irregularities within web-based activities, offering a
wealth of insights into these interactions. By dili-
gently monitoring and analyzing these records, orga-
nizations can uncover anomalies indicative of illicit
or harmful behavior. For instance, the sudden appear-
ance of repeated requests for previously unutilized
resources by a user may signal nefarious intentions
(Toprak and Yavuz, 2022).
A unique dataset sourced from Yapı Kredi
Teknoloji (YKT) serves as the foundational dataset
for evaluating the proposed model, herein referred
to as the “Banking Dataset. This collection encom-
passes web request logs capturing incoming network
traffic to an online banking platform. Comprising
over 2 million benign requests and more than 10,000
malicious requests, this dataset employs URI and Ses-
sionID attributes for malicious web session identifi-
cation. The CSIC 2010 HTTP dataset, assembled
by the “Information Security Institute” of the Span-
ish National Council for Research (CSIC), consti-
tutes the second evaluation dataset. With more than
25,000 malicious requests and 36,000 legitimate re-
quests, this dataset is designed to facilitate assess-
ments of web attack defense systems (Gim
´
enez et al.,
2012). The third dataset, known as the WAF dataset,
is an open-source compilation sourced from 30 dif-
ferent Web Application Firewalls (WAFs). It encom-
passes over 1 million benign requests and more than
40,000 malicious requests, aimed at fostering the de-
velopment of machine learning models capable of de-
tecting and categorizing WAF attacks (Ahmad, 2017).
In these datasets, individual requests aggregate to
form user sessions. In the banking dataset, user ses-
MASD: Malicious Web Session Detection Using ML-Based Classifier
489
Table 2: Dataset Instances (B: benign M: malicious).
Path Dataset B M
/ngi/index.do Banking X
/core/.env Banking X
/tienda1/index.jsp CSIC2010 X
/iisstart.htm CSIC2010 X
/blueberry WAF X
/ows-bin/windmail.exe WAF X
sions are constructed using the URI and SessionID
fields, while other datasets employ a random selec-
tion process for user session formation. To effectively
measure the performance of the model under evalu-
ation, a deliberate imbalance is created by following
a 20:100 ratio between malicious and non-malicious
sessions, as observed in previous studies (Sadeghpour
et al., 2021). This choice is influenced by the inher-
ent rarity of malicious sessions in the data, constitut-
ing approximately 1% of the total sessions (Sun et al.,
2020). Consequently, when the proposed model cor-
rectly classifies benign data, it attains an accuracy rate
of 99%. However, this seemingly impressive accu-
racy can be misleading when evaluating both bench-
mark models and the proposed model, as it doesn’t
adequately address the challenge of accurately detect-
ing the rare and critical malicious sessions within the
dataset. Detailed instances of these session propor-
tions are presented in Table 2.
4 USE CASE SCENERIOS
Many users prefer to use banking web applications for
their financial transactions. These transactions consist
of money transfers, loan or credit card applications,
and various payments. A user’s session in web ap-
plications begins by performing a request, such as a
login. Then, this session is ended by user’s leave. If
no malicious activity occurs during a user session, this
session is labeled as benign. Otherwise, it is a mali-
cious session. Due to the large and dynamic web en-
vironment, low-volume attacks, and ever-developing
attack techniques, detection via rule-based systems is
inappropriate. Resolving this contradiction could be
possible with ML-based malicious session detection
systems if the accuracy and true positive rate could be
increased.
Use Case Scenario 1: Initially, an arbitrary user
logs into the website of the bank. Then, they control
the balances of their accounts. They want to pay off
their credit card debt. They open the credit cards page
and click the button for payments. They fill out the
amount. Inadvertently, a key on the numeric pad is
stuck and a large amount is typed. They click the but-
ton for payment operation and create a request to the
system. The rule-based prevention system of the bank
interrupts the operation and forces a log-out. How-
ever, all activities of this session are legal yet a mis-
take exists. In this case, MASD should provide the
correct prediction. Otherwise, interruption of pay-
ments will lead to the loss of trust and reputation.
Use Case Scenario 2: An attacker logs into the
banking website akin to an arbitrary user. He checks
balances and credit cards. Then, he wants to trans-
fer money from this account to another account. He
opens the transfer page and fills out all details on this
page. However, he traces the request for transfer op-
eration and manually changes destination and source
accounts. The updated request is sent to the web ap-
plication several times. In this case, all requests are
passed by a rule-based prevention system as they do
not contain large amounts or illegal characters. How-
ever, MASD should be able to catch such a session as
malicious. Otherwise, financial loss occurs.
5 THE MASD ARCHITECTURE
The focus of this paper lies in introducing a novel
approach designed to identify malicious web ses-
sions. The suggested approach (MASD: Malicious
Web Session Detection) is a supervised approach that
uses the Random Forest algorithm as a classifier with
an embedding layer. This approach aims to obtain
high accuracy and true positive rate despite the ex-
pansive and dynamic nature of the web, low-volume
attacks, and data imbalances. Web application attacks
are largely preventable with this approach.
The MASD architecture, shown in Fig 1, con-
tains a classifier and embedding layer. The Ran-
dom Forest algorithm is utilized as a classifier
in this approach. The hyperparameters are de-
fined as n
estimators
= 10 and random
state
= 2. Be-
fore the data is given to the embedding layer, it
is preprocessed using code embedding (Vector
x
=
CodeEmbedding(X
input
)) and is grouped by sessions
(Vector
s
= GroupBySession(Vector
x
)). The session
list is an input for the embedding layer (X
out put
=
EmbeddingLayer(Vector
s
)). An embedding layer
plays a vital role in classification tasks by trans-
forming categorical or discrete data into dense vec-
tor representations. It captures semantic relation-
ships, reduces dimensionality, encodes latent features,
enhances generalization and robustness, and enables
transfer learning. As a result of applying this layer,
high accuracy and true positive rate in experimental
trials are achieved. The output of the embedding layer
is given to a classifier. This algorithm of this approach
NCTA 2023 - 15th International Conference on Neural Computation Theory and Applications
490
Figure 1: The MASD Architecture: The web requests are given as input for the preprocessing step. All requests are converted
to a vector of ASCII codepoints. Next, this data is grouped by sessions. The session list is given to an embedding layer. This
layer’s output, a dense vector list, is given to the classifier to predict labels for sessions.
is presented in Algorithm 1. It is used as a classifier
to predict sessions as malicious or benign. The Ran-
dom Forest algorithm is an ensemble learning method
that combines multiple decision trees to make accu-
rate predictions. Each decision tree in the forest is
constructed independently, and the final prediction is
obtained through a voting or averaging process. At
the end of the classification, the predictions are evalu-
ated using evaluation metrics such as accuracy, recall,
precision, and F1 score.
Algorithm 1: MASD: Malicious Web Session
Detection.
Input: Dataset: Labeled dataset
Dataset = ((x
1
, y
1
), ..., (x
m+n
, y
m+n
)) with m + n
queries
X
m
, Y
m
: Malicious samples X
m
= (x
1
, ..., x
m
) and
labels Y
m
= (y
1
, ..., y
m
) with m queries
X
n
, Y
n
: Benign samples X
n
= (x
1
, ..., x
n
) and labels
Y
n
= (y
1
, ..., y
n
) with n queries
Output: Y
p
: Labels predicted by the classifier
Begin:
Initialize: X
m
[], Y
m
[], X
n
[], Y
n
[]
BenignQueries, MaliciousQueries seperate
Dataset
X
n
, Y
n
apply code embedding for
BenignQueries
X
m
, Y
m
apply code embedding for
MaliciousQueries
X
ms
, X
ns
group X
n
and X
m
by session
Y
ms
, Y
ns
set labels according to X
ms
, X
ns
X
train
, X
test
get samples using X
ms
, X
ns
X
train
, X
test
process X
train
, X
test
using
embedding layer
Y
train
, Y
test
get samples using Y
ms
, Y
ns
train classifier with X
train
, Y
train
Y
p
predict labels using classifier with X
test
End of algorithm
5.1 Data Preprocessing
MASD approach utilizes code embedding as data pre-
processing for web request logs (Jemal et al., 2020).
The Uniform Resource Identifier (URI) is the first in-
put. It consists of a string to identify a resource on
the Internet. Removing the hostname of this string
yields the path, the query and the fragment of the URL
(used as the path in the rest of the paper for brevity).
The code embedding provides that the characters of
the path are presented as vectors consisting of ASCII
codes. Owed to it, characters are grouped together
or compared according to how similar they are. This
method is preferred in machine translation and lan-
guage modeling as a useful approach (Jemal et al.,
2020). Table 3 depicts the conversion to ASCII code
values from paths. After obtaining the ASCII codes
list for each path, the requests are grouped by ses-
sion. Session ID values are utilized in the Banking
Dataset to prepare sessions. Each session includes a
maximum of one hundred requests. In other datasets,
sessions are prepared by selecting up to a hundred re-
quests.
5.2 Evaluation Metrics
This paper uses the accuracy, recall, precision, and F1
score as evaluation metrics. The prediction is identi-
fied as True Positive (TP) if a malicious session (Ac-
tual Positive) is labeled as a malicious session (Pre-
dicted Positive). True Negative (TN), False Positive
(FP), and False Negative (FN) represent other respec-
tive results in the same manner. The calculations of
evaluation metrics are shown with the following for-
mulas.
Accuracy =
T P +TN
T P +TN + FP + FN
(1)
Recall(TruePositiveRate) =
T P
T P +FN
(2)
MASD: Malicious Web Session Detection Using ML-Based Classifier
491
Table 3: Embedding of Instances.
Path ASCII Codepoint Vector Representation
/ngi/index.do [47, 110, 103, 105, 47, 105, 110, 100, 101, 120, 46, 100, 111]
/ows-bin/windmail.exe [47, 111, 119, 115, 45, 98, 105, 110, 47, 119, 105, 110, 100, 109, 97, 105, 108, 46, 101, 120, 101]
Precision =
T P
T P +FP
(3)
F1 =
2 × Precision ×Recall
Precision + Recall
=
2 × T P
2 × T P + FP + FN
(4)
6 BENCHMARK MODELS
SVM, Na
¨
ıve Bayes, CNN, Decision Tree, DBSCAN,
and SOM are implemented as the benchmark models
to evaluate the proposed approach. The experimental
trials are carried out on the same datasets.
Support Vector Machines (SVMs) are popular and
powerful class of supervised learning algorithms used
for classification and regression tasks. SVMs aim to
find the optimal hyperplane that separates different
classes or predicts continuous values with the maxi-
mum margin. They are particularly effective in han-
dling complex and high-dimensional data by trans-
forming it into a higher-dimensional feature space us-
ing kernel functions. The key strength of SVMs lies in
their ability to handle non-linear relationships by im-
plicitly mapping the data into a higher-dimensional
space. By maximizing the margin between classes,
SVMs promote generalization and robustness against
overfitting. They can handle both binary and multi-
class classification problems and provide continuous
predictions for regression tasks. However, SVMs can
be computationally expensive, require careful selec-
tion of hyperparameters, and their performance may
vary depending on the choice of the kernel function.
Nonetheless, SVMs remain widely used and valued
for their flexibility, robustness, and theoretical foun-
dation in machine learning (Azab et al., 2022; Cortes
and Vapnik, 1995). The hyperparameters are defined
as γ = 10
3
, C = 0.25, and tol = 10
-7
.
Na
¨
ıve Bayes (NB) is a notably uncomplicated yet
formidable probabilistic machine learning algorithm
employed extensively for classification endeavors.
Rooted in Bayes’ theorem, this algorithm operates
under the presumption that features are condition-
ally independent given the class variable, a concept
termed na
¨
ıve independence. This assumption sim-
plifies computation, facilitating efficient training and
prediction processes. Notable for their rapid learn-
ing pace and scalability, Na
¨
ıve Bayes models prove
advantageous for vast datasets. Their robust perfor-
mance, even when confronted with limited training
data and high-dimensional feature spaces, has estab-
lished their significance. Notably, Na
¨
ıve Bayes clas-
sifiers find widespread application in text classifica-
tion tasks like spam filtering or sentiment analysis,
where the independence assumption aligns well. Al-
though the naivety hypothesis might not hold true in
all scenarios, Na
¨
ıve Bayes classifiers consistently de-
liver competitive outcomes, embodying a dependable
and comprehensible baseline model for classification
challenges (Azab et al., 2022; Lewis, 1998).
Convolutional Neural Networks (CNNs) are a
subset of deep learning models built to be exception-
ally good at processing and evaluating visual data
such as images and videos. CNNs have transformed
computer vision tasks and attained cutting-edge per-
formance in several fields. The key component of
CNNs is the convolutional layer, which applies a set
of learnable filters to the input data, capturing local
patterns and spatial dependencies. This allows CNNs
to automatically learn hierarchical representations of
the input, starting from low-level features and grad-
ually building up to more complex and abstract fea-
tures. The pooling layers further downsample the
features, reducing the spatial dimensionality and ex-
tracting the most salient information. The stacked
convolutional and pooling layers create a powerful
feature extractor that is robust to variations in posi-
tion, rotation, and scale. CNNs are typically followed
by fully connected layers that perform classification
or regression tasks (Azab et al., 2022). Two con-
volution layers, a batch normalization layer, a max
pooling layer, and a fully connected layer comprise
the CNN model used for experimental trials. The
Epochs parameter is determined as 40, and the batch
size parameter is used as 20. Additionally, the Adam
optimizer is utilized with the following parameters;
learningRate = 0.0005, beta
1
= 0.9, beta
2
= 0.999,
and amsgrad = False.
Decision Tree is a supervised learning technique.
It aids in decision-making by breaking down com-
plex decisions into a series of simpler ones. The
algorithm divides data into subsets depending on
several features, much like the branches of a tree,
providing a clear path to solutions. Classifica-
tion trees are tree models where the target variable
can take a discrete range of values. The leaves
of these tree structures represent class labels ob-
NCTA 2023 - 15th International Conference on Neural Computation Theory and Applications
492
tained from the branches of the attributes (Goseva-
Popstojanova et al., 2012). Default hyperparame-
ters are used (criterion =“gini”, min
samplesSplit
= 2,
min
samplesLea f
= 1, and min
weightFractionLea f
= 0.0).
DBSCAN (Density-Based Spatial Clustering of
Applications with Noise) is an unsupervised tech-
nique that groups the data points according to their
proximity. Researchers in data mining and machine
learning frequently use the DBSCAN approach. DB-
SCAN performs well when all clusters are sufficiently
dense and well-separated by low-density regions (Sun
et al., 2020). The hyperparameters are used as ε =
0.19, min
samples
= 2.
Self-Organizing Map (SOM) is a popular unsuper-
vised machine learning (ML) technique in the data
exploration and visualization area. It creates a low-
dimensional representation of high-dimensional data
by matching a grid of nodes to the input information
over a predefined number of repetitions. Typically,
neurons are built into a single, rectangular or hexag-
onal 2-dimensional grid (called a node or reference
vector) (Sadeghpour et al., 2021). The hyperparame-
ters are num
rows
= 10, num
cols
= 10, max
mdistance
= 4,
max
learningRate
= 0.5, and max
steps
= 750.
All benchmark models are placed as a classifier in
the MASD architecture. Then, the experimental re-
sults are obtained for benchmark models. Validation
and training accuracy values are followed to avoid
overfitting. Overfitting occurs when the training accu-
racy continues to increase while the validation accu-
racy plateaus or diminishes. The stability of training
and validation accuracy values have been provided by
tuning hyperparameters. The descriptions of hyper-
parameters are given in Table 6.
7 RESULTS
The experimental trials are completed on an ordinary
computer using Python 3.8
1
and Tensorflow 2.10
2
to
implement all models.
7.1 Evaluation with Datasets
For evaluation in the banking domain, the sample ses-
sion sets are meticulously arranged in proportions of
20 malicious sessions to 100 benign sessions, ensur-
ing a balanced assessment. For all datasets, each ses-
sion contains between one and a hundred requests.
The proposed approach has achieved better results
1
Python Software Foundation. Available online: https:
//www.python.org/
2
TensorFlow Library. Available online: https://www.te
nsorflow.org
(0.9583) than the benchmark models over the banking
dataset. The SOM algorithm has obtained the second-
best accuracy (0.9167). Decision Tree has achieved
an accuracy of 0.9083. The accuracy of CNN and
SVM is 0.8333. NB has better accuracy (0.8417)
than CNN and SVM. Additionally, the DBSCAN al-
gorithm has the lowest accuracy of 0.7000.
The sample session sets are generated by using
malicious and benign queries for WAF and CSIC
2010 datasets. After compiling malicious and benign
sessions, the sample sets of these sessions are pre-
pared with 20:100 proportions (20 malicious sessions
and 100 benign sessions). The proposed approach
has outperformed benchmark models over CSIC2010
data. The suggested technique has an accuracy of
0.9667, while CNN and SVM have an accuracy of
0.8333. Additionally, NB and DBSCAN have an ac-
curacy of 0.7833, and 0.7417, respectively. The SOM
algorithm has an accuracy of 0.9417. Over the WAF
dataset, the suggested technique has the highest ac-
curacy (0.9917). The accuracy values of both SVM,
CNN, and NB are 0.8333, 0.8417, and 0.6583, re-
spectively. Additionally, DBSCAN has an accuracy
of 0.7500. SOM and Decision Tree are very promis-
ing, with accuracy rates of 0.9833 and 0.9750. These
results are presented in Table 4.
7.2 Summary
Upon analysis of the results presented in Table 4, it
becomes evident that the proposed model exhibits a
notable performance superiority over its rival method-
ologies across all examined datasets. The tech-
nique securing the second-highest accuracy, Self-
Organizing Maps (SOM), falls short by a margin
ranging from 1% to 4% when compared to the pro-
posed model’s accuracy. Similarly, the Decision Tree
method, ranking third in accuracy, lags behind by
approximately 2% to 7% across the datasets. The
Support Vector Machine (SVM) maintains a consis-
tent accuracy value across all datasets, yet this value
remains consistently 12% to 16% lower than that
achieved by the proposed model. The Naive Bayes
(NB) approach registers a significantly diminished
performance, attaining an accuracy of 11% to 30%
lower than that of the proposed model. Likewise,
the Convolutional Neural Network (CNN) yields ac-
curacy figures that are consistently 12% to 16% be-
low the proposed model’s accuracy. Furthermore,
the Density-Based Spatial Clustering of Applications
with Noise (DBSCAN) demonstrates an accuracy
shortfall of 22% in comparison to the proposed ap-
proach. These findings collectively underscore the ro-
bustness and efficacy of the proposed model in deliv-
MASD: Malicious Web Session Detection Using ML-Based Classifier
493
ering superior results across diverse datasets.
Table 4: Results of the Experiments (Top two results of each
metric are bolded, precedence of MASD is certain).
Method Metric Banking CSIC2010 WAF
MASD
with RF
(Proposed)
Accuracy 0.9583 0.9667 0.9917
Recall 0.9583 0.9667 0.9917
Precision 0.9594 0.9662 0.9917
F1 Score 0.9587 0.9660 0.9916
SVM
Accuracy 0.8333 0.8333 0.8333
Recall 0.8333 0.8333 0.8333
Precision 0.6944 0.6944 0.6944
F1 Score 0.7576 0.7576 0.7576
Na
¨
ıve
Bayes
Accuracy 0.8417 0.7833 0.6583
Recall 0.8417 0.7833 0.6583
Precision 0.9188 0.8933 0.8880
F1 Score 0.8589 0.8092 0.7008
CNN
Accuracy 0.8333 0.8333 0.8417
Recall 0.8333 0.8383 0.8417
Precision 0.6944 0.6944 0.8669
F1 Score 0.7576 0.7576 0.7769
Decision
Tree
Accuracy 0.9083 0.8917 0.9750
Recall 0.9083 0.8917 0.9750
Precision 0.9409 0.9343 0.9783
F1 Score 0.9156 0.9012 0.9757
DBSCAN
Accuracy 0.7000 0.7417 0.7500
Recall 0.7000 0.7417 0.7500
Precision 0.7901 0.8521 0.8241
F1 Score 0.7317 0.7720 0.7748
SOM
Accuracy 0.9167 0.9417 0.9833
Recall 0.9167 0.9417 0.9833
Precision 0.9133 0.9455 0.9837
F1 Score 0.9105 0.9365 0.9830
8 CONCLUSIONS
In the realm of web session security, this paper repre-
sents a significant stride forward by presenting a ma-
chine learning-grounded solution for the detection of
malicious web sessions. The study introduces a novel
classifier-driven methodology that capitalizes on the
potency of machine learning algorithms to discern
and classify malicious web sessions. This approach
is applied across three distinct datasets, underscoring
its versatility and potential effectiveness. Notably, the
approach stands out by achieving a remarkable ac-
curacy rate, dispelling the need for the often restric-
tive feature extraction phase that typically extracts a
limited array of features. The recorded results strik-
ingly surpass the 99% mark across all assessed met-
rics. These outcomes collectively highlight the poten-
tial efficacy of the machine learning-based classifier
approach in fortifying web session security. Further-
more, the groundwork laid by this research paves the
way for future extensions, possibly encompassing the
classification of diverse user sessions and a deeper ex-
ploration of user behavior patterns.
REFERENCES
Ahmad, F. (2017). WAF malicious queries data sets. [On-
line]. Available: https://web.archive.org/web/2023
0301151428/https://github.com/faizann24/Fwaf-Ma
chine-Learning-driven-Web-Application-Firewall/.
(Accessed on 01 March 2023).
Azab, A., Khasawneh, M., Alrabaee, S., Choo, K.-K. R.,
and Sarsour, M. (2022). Network traffic classification:
Techniques, datasets, and challenges. Digital Commu-
nications and Networks.
Cortes, C. and Vapnik, V. (1995). Support-vector networks.
Machine learning, 20:273–297.
Gim
´
enez, C. T., Villegas, A. P., and Mara
˜
n
´
on, G.
´
A. (2012).
Information Security Institute, HTTP DATASET
CSIC 2010. [Online]. Available: https://www.tic.itef
i.csic.es/dataset/. (Accessed on 18 December 2014).
Goseva-Popstojanova, K., Anastasovski, G., and Pantev,
R. (2012). Classification of malicious Web sessions.
In 2012 21st International Conference on Computer
Communications and Networks (ICCCN), pages 1–9.
IEEE.
Guti
´
errez, M. G.-C., Pongilupi, J. V., and LLin
`
as, M. M.
(2010). Web sessions anomaly detection in dynamic
environments. In ISSE 2009 Securing Electronic Busi-
ness Processes: Highlights of the Information Secu-
rity Solutions Europe 2009 Conference, pages 216–
220. Springer.
Jemal, I., Haddar, M. A., Cheikhrouhou, O., and Mah-
foudhi, A. (2020). Malicious HTTP request detection
using code-level convolutional neural network. In In-
ternational Conference on Risks and Security of Inter-
net and Systems, pages 317–324. Springer.
Lewis, D. D. (1998). Naive (Bayes) at forty: The indepen-
dence assumption in information retrieval. In Machine
Learning: ECML-98: 10th European Conference on
Machine Learning Chemnitz, Germany, April 21–23,
1998 Proceedings 10, pages 4–15. Springer.
Mansfield-Devine, S. (2022). Verizon: Data Breach Inves-
tigations Report.
PurpleSec (2022). Cyber Security Statistics: The Ultimate
List Of Stats Data, & Trends For 2022. [Online].
Available: https://web.archive.org/web/20221205
155455/https://purplesec.us/resources/cyber-securit
y-statistics/. (Accessed on 5 December 2022).
Sadeghpour, S., Vlajic, N., Madani, P., and Stevanovic, D.
(2021). Unsupervised ML Based Detection of Ma-
licious Web Sessions with Automated Feature Selec-
tion: Design and Real-World Validation. In 2021
IEEE 18th Annual Consumer Communications & Net-
working Conference (CCNC), pages 1–9. IEEE.
Stevanovic, D., Vlajic, N., and An, A. (2011). Unsuper-
vised clustering of Web sessions to detect malicious
NCTA 2023 - 15th International Conference on Neural Computation Theory and Applications
494
and non-malicious website users. Procedia Computer
Science, 5:123–131.
Suchacka, G. and Iwa
´
nski, J. (2020). Identifying legitimate
Web users and bots with different traffic profiles—an
Information Bottleneck approach. Knowledge-Based
Systems, 197:105875.
Sun, Y., Xie, Y., Wang, W., Zhang, S., Gao, J., and Chen,
Y. (2020). WSAD: An Unsupervised Web Session
Anomaly Detection Method. In 2020 16th Interna-
tional Conference on Mobility, Sensing and Network-
ing (MSN), pages 735–739. IEEE.
Toprak, S. and Yavuz, A. (2022). Web Application Firewall
Based on Anomaly Detection using Deep Learning. In
ACTA INFOLOGICA 2022, pages 142–149.
APPENDIX
Table 5 shows the abbreviations utilized in this paper.
The descriptions mentioned in this paper are
shown in Table 6.
Table 5: ABBREVIATIONS.
Abbreviation Definition
B Benign
CNN Convolutional Neural Network
FN False Negative
FP False Positive
HTTP Hypertext Transfer Protocol
M Malicious
ML Machine Learning
RF Random Forest
SL Supervised Learning
SOM Self-Organizing Map (SOM)
SVM Support Vector Machine
TN True Negative
TP True Positive
TPR True Positive Rate
UL Unsupervised Learning
URI Uniform Resource Identifier
WAF Web Application Firewall
WSAD Web Session Anomaly Detection
YKT Yapı Kredi Technology
Table 6: Descriptions of Hyperparameters.
Method Hyperparameter Description
RF
n
estimators
The number of trees in the forest
random
state
The randomization parameter for the bootstrapping of data points used in creating trees
and sampling features to use when searching for the optimum split across all nodes
SVM
γ The coefficient of the kernel
C The parameter for regularization
tol Criteria for terminating tolerance
CNN
learningRate The learning rate
beta
1
Exponential decay rate of first-moment estimates
beta
2
Exponential decay rate of second-moment estimates
amsgrad Whether or not to use the AMSGrad variation
Decision
Tree
criterion The function for determining the quality of a split
min
samplesS plit
The bare minimum of samples needed to separate an inner node
min
samplesLea f
The bare minimum of samples necessary at a leaf node
min
weightFractionLea f
The weighted least proportion of the total weights is essential to be at a leaf node
DBSCAN
ε The shortest distance among two instances for one to be deemed in the vicinity of the other
min
samples
The number of instances that must exist in the vicinity for a point to be called a core point
SOM
num
rows
The number of rows in the grid
num
cols
The number of columns in the grid
max
mdistance
Max value calculated using the Manhattan distance between two grid points
max
learningRate
Maximum learning rate
max
steps
The step size for training iteration
MASD: Malicious Web Session Detection Using ML-Based Classifier
495