MASD: Malicious Web Session Detection Using ML-Based Classifier

Dilek Yılmazer Demirel

and Mehmet Tahir Sandıkkaya

Department of Computer Engineering, Istanbul Technical University, Istanbul, Turkey

Keywords:

Malicious Web Session Detection, Machine Learning, Classification.

Abstract:

The development of web applications and services has increased security concerns, especially around identifying malicious web sessions. Malicious web sessions pose a significant risk to users, potentially resulting in data breaches, unauthorized access, and other malicious activities. This study presents an innovative technique for detecting malicious web sessions using a machine learning-driven classifier. To examine the features of web sessions, the suggested technique combines an embedding layer with machine learning approaches. Three different datasets were used in the empirical studies to confirm the effectiveness of the approach: a unique compilation of Internet banking web request logs provided by Yapı Kredi Teknoloji, the well-known CSIC 2010 HTTP dataset, and the publicly accessible WAF dataset. The experimental results are compared to known approaches such as Random Forest, Convolutional Neural Networks (CNN), Support Vector Machines (SVM), Naïve Bayes, Decision Trees, DBSCAN, and Self-Organizing Maps (SOM). The empirical findings demonstrate the superiority of the suggested technique, especially when Random Forest is used as the classifier. The attained accuracy of 99.17% surpasses the compared methods, highlighting the approach's ability to efficiently identify and block malicious web sessions.

1 INTRODUCTION

The digital interaction, communication, and com-

merce landscape has been profoundly reshaped by

the widespread integration of web-based applications.

As users increasingly navigate these virtual environ-

ments, their activities have become pivotal to mod-

ern life. However, this rapid technological evolution

has not only heralded unprecedented convenience but

also ushered in an era of heightened vulnerability. In

tandem with this progress, malicious actors have in-

geniously exploited the expanding digital infrastruc-

ture, thereby imperiling the security and privacy of

web users. In particular, attacks intended for web

sessions have emerged as persistent threats, thereby

necessitating the development of potent and stream-

lined strategies for their identiﬁcation and prevention

(Mansﬁeld-Devine, 2022; Toprak and Yavuz, 2022).

Cybersecurity metrics underscore the gravity of the

situation, revealing that a substantial 68% of web ap-

plications are susceptible to breaches involving sen-

sitive data (PurpleSec, 2022). Equally concerning is

the report indicating that web applications serve as the

conduit for 56% of all cyberattacks, 95% of which are driven by financial motives (Mansfield-Devine, 2022). In the face of these mounting challenges, conventional rule-based attack prevention systems have proven incapable of adapting to swiftly evolving attack vectors. As a result, an increasing number of researchers have turned to Machine Learning (ML)-based solutions as a promising avenue for detecting malicious HTTP sessions, in pursuit of a more agile and effective defense paradigm.

https://orcid.org/0000-0002-4008-4478
https://orcid.org/0000-0002-9756-603X

Yılmazer Demirel, D. and Sandıkkaya, M.
MASD: Malicious Web Session Detection Using ML-Based Classifier.
DOI: 10.5220/0012174800003595
In Proceedings of the 15th International Joint Conference on Computational Intelligence (IJCCI 2023), pages 487-495
ISBN: 978-989-758-674-3; ISSN: 2184-3236

Clustering algorithms have garnered significant attention among researchers seeking effective solutions for the detection of malicious web sessions. In various studies, clustering algorithms such as DBSCAN, SOM, and Modified ART2 have been employed to distinguish between legitimate and malicious sessions, often coupled with a feature extraction phase (Sun et al., 2020; Stevanovic et al., 2011; Sadeghpour et al., 2021). During the feature extraction phase, these studies extract a constrained set of features from web sessions, which subsequently serve as input for the clustering algorithms. This method facilitates session-based clustering to discern and isolate malicious sessions. Complementary to these unsupervised methodologies, several supervised approaches have also emerged. These endeavors not only aim to classify web sessions into malicious and

non-malicious categories but also employ feature ex-

traction techniques akin to their unsupervised coun-

terparts (Goseva-Popstojanova et al., 2012). In the

context of this discourse, the present paper intro-

duces an innovative approach to the detection of ma-

licious web sessions utilizing a Machine Learning

(ML)-based classiﬁer. In a departure from conven-

tional techniques, this approach eschews the feature

extraction phase that typically involves a limited fea-

ture set. In this manner, this novel approach achieves

heightened accuracy and a reduced false-positive rate,

thereby fortifying its efﬁcacy in the domain of ﬁnan-

cial security. Major contributions are summarized in

the list below:

• Introducing and implementing a new approach to detect malicious web sessions using Random Forest as a classifier with an embedding layer, as described in Section 5.

• Implementing benchmark models such as SVM, Naïve Bayes, CNN, Decision Tree, DBSCAN, and SOM, as given in Section 6.

• Evaluating the proposed model and the benchmarks on three different datasets (a unique financial dataset, the CSIC 2010 HTTP dataset (Giménez et al., 2012), and the WAF dataset (Ahmad, 2017)) to compare performance after the training and test steps, as shown in Section 7.

The remainder of this paper is structured as follows: Section 2 provides an overview of related research, outlining current methodologies and outcomes in malicious web session detection. Section 3 describes the datasets employed and explains how their samples are categorized. Section 4 presents use case scenarios, Section 5 describes the MASD architecture, and Section 6 introduces the benchmark models. Section 7 presents experimental results and comparative analyses, shedding light on the performance of the proposed approach in contrast to existing methods. Finally, the findings of this study are summarized in Section 7.2.

2 RELATED WORK

Malicious web session detection has attracted the attention of several studies. Both supervised and unsupervised approaches have been explored to detect malicious sessions. Unsupervised approaches typically combine a clustering algorithm with a feature extraction method. For instance, Sun et al. propose an unsupervised learning technique called WSAD to detect anomalies in web session data. The authors recognize the limits of standard approaches that rely on labeled data and present an alternative. WSAD combines a feature extraction mechanism that converts raw web session data into meaningful representations with a density-based clustering algorithm that groups comparable sessions. In addition, a session-based outlier identification step flags sessions that differ considerably from existing clusters (Sun et al., 2020). Another study presents a comprehensive approach for identifying abnormalities in web session data, particularly in dynamic environments. The authors discuss the problem of recognizing anomalous behaviors that arise from the ever-changing nature of web applications. Their method combines web usage mining and clustering techniques to capture abnormal patterns, and it describes each web session with multiple session-based and sequence-based features (Gutiérrez et al., 2010). Stevanovic et al. suggest a method for unsupervised clustering of web sessions to distinguish between malicious and non-malicious website sessions. Their paper addresses the challenge of identifying potential threats and abnormal behaviors without relying on labeled data or prior knowledge of attack patterns. The approach combines feature engineering with a clustering algorithm to group similar web sessions, aiming to separate legitimate user behavior from malicious activity by capturing the inherent patterns and similarities within the session data (Stevanovic et al., 2011). Additionally, Sadeghpour et al. present another unsupervised approach that uses the SOM algorithm for clustering together with an automated feature selection method based on gradient boosting (Sadeghpour et al., 2021).

On the other hand, several researchers focus on supervised techniques over web session data. For instance, Goseva-Popstojanova et al. address the problem of distinguishing between legitimate and malicious web sessions. Their method uses machine learning techniques and feature engineering to extract meaningful information from web session data. The researchers employ a variety of features, such as session duration, request types, and frequency, to represent each session. These features are then used to train a classification model that categorizes web sessions as malicious or non-malicious (Goseva-Popstojanova et al., 2012). In addition to this work, other approaches exist to determine whether a web session was created by a legitimate user. Suchacka et al. focus on separating sessions by creator, either users or bots, using an agglomerative Information Bottleneck (aIB) algorithm that aims to extract the most informative features from the traffic data while minimizing redundancy (Suchacka and Iwański, 2020).

NCTA 2023 - 15th International Conference on Neural Computation Theory and Applications

Table 1: Comparative Analysis: Proposed Approach vs. Related Work.

| Research | Date | SL | UL | Approach | Dataset |
|---|---|---|---|---|---|
| Sun et al. (Sun et al., 2020) | 2020 | | X | Clustering (DBSCAN) | Two Different Web Sites Data |
| Gutiérrez et al. (Gutiérrez et al., 2010) | 2010 | X | X | Web Usage Mining, Clustering, and Graph Mining | Synthetic Data |
| Stevanovic et al. (Stevanovic et al., 2011) | 2011 | | X | Clustering (SOM and Modified ART2) | Dataset obtained using Web-Crawlers |
| Goseva et al. (Goseva-Popstojanova et al., 2012) | 2012 | X | | Decision Tree (J48 and PART) | WebDBAdmin and Web 2.0 Datasets collected using these honeypots |
| Sadeghpour et al. (Sadeghpour et al., 2021) | 2021 | | X | Clustering (SOM) | Dataset provided by the York University's EECS |
| Suchacka et al. (Suchacka and Iwański, 2020) | 2020 | | X | Clustering (aIB Algorithm) | Dataset obtained by real e-commerce website |
| Proposed Approach | 2023 | X | | Random Forest | Dataset obtained by real banking website, CSIC 2010, and WAF Dataset |

The comparison of the suggested approach and

other studies is depicted in Table 1. This paper intro-

duces a series of distinctive advancements and contri-

butions that set it apart from existing studies on mali-

cious session detection.

• To begin with, the paper proposes a novel ap-

proach that speciﬁcally targets the detection of

malicious sessions. This innovative method ad-

dresses the intricate task of extracting discrimi-

native information from both malicious and be-

nign sessions through the utilization of web re-

quests, surmounting this challenge through the

implementation of an embedding layer.

• Moreover, this study pioneers the application of

the proposed approach to analyze web session

data sourced from the banking industry, an arena

that has been relatively underexplored in the con-

text of malicious session detection. The efﬁcacy

of the introduced approach is rigorously assessed

using a diverse set of benchmark datasets, effec-

tively showcasing its substantial practical poten-

tial.

3 DATASETS

Every interaction directed at a web server leaves a

trace in the form of chronological web request logs.

These logs play a pivotal role later in identifying

irregularities within web-based activities, offering a

wealth of insights into these interactions. By dili-

gently monitoring and analyzing these records, orga-

nizations can uncover anomalies indicative of illicit

or harmful behavior. For instance, the sudden appear-

ance of repeated requests for previously unutilized

resources by a user may signal nefarious intentions

(Toprak and Yavuz, 2022).

A unique dataset sourced from Yapı Kredi Teknoloji (YKT) serves as the foundational dataset for evaluating the proposed model, herein referred to as the “Banking Dataset.” This collection encompasses web request logs capturing incoming network traffic to an online banking platform. Comprising over 2 million benign requests and more than 10,000 malicious requests, this dataset employs URI and SessionID attributes for malicious web session identification. The CSIC 2010 HTTP dataset, assembled by the “Information Security Institute” of the Spanish National Council for Research (CSIC), constitutes the second evaluation dataset. With more than 25,000 malicious requests and 36,000 legitimate requests, this dataset is designed to facilitate assessments of web attack defense systems (Giménez et al., 2012). The third dataset, known as the WAF dataset, is an open-source compilation sourced from 30 different Web Application Firewalls (WAFs). It encompasses over 1 million benign requests and more than 40,000 malicious requests, aimed at fostering the development of machine learning models capable of detecting and categorizing WAF attacks (Ahmad, 2017).

In these datasets, individual requests aggregate to form user sessions. In the banking dataset, user sessions are constructed using the URI and SessionID fields, while the other datasets form user sessions through a random selection of requests. In real traffic, malicious sessions are inherently rare, constituting approximately 1% of the total sessions (Sun et al., 2020). At that proportion, a model that merely classifies every session as benign already attains an accuracy of 99%. This seemingly impressive accuracy is misleading when evaluating both the benchmark models and the proposed model, as it does not reflect the ability to detect the rare and critical malicious sessions within the dataset. To measure model performance more effectively, the evaluation sets therefore use a deliberate 20:100 ratio between malicious and non-malicious sessions, as in previous studies (Sadeghpour et al., 2021). Example benign and malicious requests from each dataset are presented in Table 2.

Table 2: Dataset Instances (B: benign, M: malicious).

| Path | Dataset | B | M |
|---|---|---|---|
| /ngi/index.do | Banking | X | |
| /core/.env | Banking | | X |
| /tienda1/index.jsp | CSIC2010 | X | |
| /iisstart.htm | CSIC2010 | | X |
| /blueberry | WAF | X | |
| /ows-bin/windmail.exe | WAF | | X |
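The effect of class prevalence on accuracy can be made concrete with a short calculation. The sketch below assumes a hypothetical trivial classifier that labels every session as benign; the function name is illustrative and is not part of MASD. It shows why accuracy alone says little when malicious sessions are rare.

```python
def all_benign_accuracy(n_malicious: int, n_benign: int) -> float:
    """Accuracy of a classifier that labels every session as benign."""
    # Every benign session counts as correct, every malicious one as wrong.
    return n_benign / (n_malicious + n_benign)


# Roughly 1% malicious sessions, as reported for raw traffic.
raw_traffic = all_benign_accuracy(n_malicious=1, n_benign=99)

# The 20:100 sampling ratio used in the experiments.
sampled = all_benign_accuracy(n_malicious=20, n_benign=100)

print(f"raw traffic:   {raw_traffic:.4f}")  # 0.9900
print(f"20:100 sample: {sampled:.4f}")      # 0.8333
```

Under the 20:100 ratio the trivial baseline drops to about 83%, which leaves more room to tell models apart, although recall and precision on the malicious class remain necessary for a full picture.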

4 USE CASE SCENARIOS

Many users prefer banking web applications for their financial transactions. These transactions consist of money transfers, loan or credit card applications, and various payments. A user's session in a web application begins with a request, such as a [...]. If no malicious activity occurs during a user session, the session is labeled as benign. Otherwise, it is a malicious session. Because of the large and dynamic web environment, low-volume attacks, and ever-developing attack techniques, detection via rule-based systems is inadequate. ML-based malicious session detection systems could overcome this limitation, provided that their accuracy and true positive rate are high enough.

Use Case Scenario 1: An arbitrary user logs into the website of the bank and checks the balances of their accounts. They want to pay off their credit card debt, so they open the credit cards page and click the button for payments. While they fill in the amount, a key on the numeric pad sticks and a large amount is typed by mistake. They click the payment button, which sends a request to the system. The rule-based prevention system of the bank interrupts the operation and forces a log-out. However, all activities in this session are legitimate; the user simply made a mistake. In this case, MASD should predict the session as benign. Otherwise, the interruption of payments leads to a loss of trust and reputation.

Use Case Scenario 2: An attacker logs into the banking website like an arbitrary user. He checks balances and credit cards. Then, he wants to transfer money from this account to another account. He opens the transfer page and fills out all details on this page. However, he intercepts the transfer request and manually changes the destination and source accounts. The modified request is sent to the web application several times. In this case, all requests pass the rule-based prevention system, as they do not contain large amounts or illegal characters. However, MASD should be able to flag such a session as malicious. Otherwise, financial loss occurs.
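The two scenarios can be illustrated with a toy rule-based filter. This is a minimal sketch under stated assumptions: the amount threshold, the character pattern, the request fields, and the paths are all hypothetical and do not describe the bank's actual rules.

```python
import re

# Hypothetical rule set: both the threshold and the character pattern are
# illustrative assumptions, not the bank's actual rules.
MAX_AMOUNT = 50_000
ILLEGAL_CHARS = re.compile(r"[<>;'\"]")


def rule_based_block(request: dict) -> bool:
    """Return True if a single request violates a static rule."""
    if request.get("amount", 0) > MAX_AMOUNT:
        return True
    return any(ILLEGAL_CHARS.search(str(value)) for value in request.values())


# Scenario 1: a legitimate user mistypes a very large payment amount.
scenario_1 = [{"path": "/cards/pay", "amount": 9_999_999}]

# Scenario 2: an attacker replays a tampered transfer with small amounts.
scenario_2 = [
    {"path": "/transfer", "src": "ACC-1", "dst": "ACC-9", "amount": 100}
    for _ in range(5)
]

print(any(rule_based_block(r) for r in scenario_1))  # True: benign session is blocked
print(any(rule_based_block(r) for r in scenario_2))  # False: malicious session passes
```

Each rule inspects one request in isolation, so neither the harmless typo nor the repeated tampered transfer is judged in the context of its session, which is the gap a session-level classifier is meant to close.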

5 THE MASD ARCHITECTURE

The focus of this paper lies in introducing a novel

approach designed to identify malicious web ses-

sions. The suggested approach (MASD: Malicious

Web Session Detection) is a supervised approach that

uses the Random Forest algorithm as a classiﬁer with

an embedding layer. This approach aims to obtain

high accuracy and true positive rate despite the ex-

pansive and dynamic nature of the web, low-volume

attacks, and data imbalances. Web application attacks

are largely preventable with this approach.

The MASD architecture, shown in Fig. 1, contains an embedding layer and a classifier. The Random Forest algorithm is used as the classifier, with the hyperparameters n_estimators = 10 and random_state = 2. Before the data is given to the embedding layer, it is preprocessed using code embedding, $\mathit{Vector}_1 = \mathrm{CodeEmbedding}(X_{\mathit{input}})$, and grouped by sessions, $\mathit{Vector}_2 = \mathrm{GroupBySession}(\mathit{Vector}_1)$. The session list is the input for the embedding layer, $X_{\mathit{output}} = \mathrm{EmbeddingLayer}(\mathit{Vector}_2)$. An embedding layer plays a vital role in classification tasks by transforming categorical or discrete data into dense vector representations. It captures semantic relationships, reduces dimensionality, encodes latent features, enhances generalization and robustness, and enables transfer learning. Applying this layer contributes to the high accuracy and true positive rate achieved in the experimental trials. The output of the embedding layer is given to the classifier, which predicts whether each session is malicious or benign. The algorithm of this approach is presented in Algorithm 1. Random Forest is an ensemble learning method that combines multiple decision trees to make accurate predictions. Each decision tree in the forest is constructed independently, and the final prediction is obtained through a voting or averaging process. At the end of the classification, the predictions are evaluated using accuracy, recall, precision, and F1 score.

Figure 1: The MASD Architecture: The web requests are given as input for the preprocessing step. All requests are converted to a vector of ASCII codepoints. Next, this data is grouped by sessions. The session list is given to an embedding layer. This layer's output, a dense vector list, is given to the classifier to predict labels for sessions.
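The role of the embedding layer can be sketched as a lookup table that maps each code point to a dense vector. The paper does not state the embedding dimension, how the layer is trained, or how per-character vectors are combined into one vector per session, so the dimension of 8, the random initialization, and the mean pooling below are all assumptions made for illustration.

```python
import random
from typing import List

EMBEDDING_DIM = 8   # assumed size; the paper does not state the dimension
VOCAB_SIZE = 128    # one row per ASCII code point

rng = random.Random(2)
# Lookup table: row i is the dense vector for code point i. A trained layer
# would learn these values; random values stand in for them here.
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in range(VOCAB_SIZE)
]


def embed_session(session: List[List[int]]) -> List[float]:
    """Map a session (a list of code-point vectors) to one dense vector.

    Mean pooling over all characters is an assumption made for this sketch.
    """
    codes = [code for request in session for code in request if code < VOCAB_SIZE]
    if not codes:
        return [0.0] * EMBEDDING_DIM
    return [
        sum(embedding_table[code][d] for code in codes) / len(codes)
        for d in range(EMBEDDING_DIM)
    ]


session = [[47, 110, 103, 105], [47, 99, 111, 114, 101]]  # "/ngi", "/core"
features = embed_session(session)
print(len(features))  # 8
```

Fixed-length vectors of this kind could then be passed to a Random Forest with the stated hyperparameters, for example scikit-learn's `RandomForestClassifier(n_estimators=10, random_state=2)`, although the text does not name the library used for the classifier.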

Algorithm 1: MASD: Malicious Web Session Detection.

Input: Dataset: labeled dataset Dataset = ((x_1, y_1), ..., (x_{m+n}, y_{m+n})) with m + n queries
       X_M, Y_M: malicious samples X_M = (x_1, ..., x_m) and labels Y_M = (y_1, ..., y_m) with m queries
       X_B, Y_B: benign samples X_B = (x_1, ..., x_n) and labels Y_B = (y_1, ..., y_n) with n queries
Output: Y_pred: labels predicted by the classifier
Begin:
    Initialize: X_M ← [], Y_M ← [], X_B ← [], Y_B ← []
    BenignQueries, MaliciousQueries ← separate Dataset
    X_B, Y_B ← apply code embedding for BenignQueries
    X_M, Y_M ← apply code embedding for MaliciousQueries
    X_B, X_M ← group X_B and X_M by session
    Y_B, Y_M ← set labels according to X_B, X_M
    X_train, X_test ← get samples using X_B, X_M
    X_train, X_test ← process X_train, X_test using embedding layer
    Y_train, Y_test ← get samples using Y_B, Y_M
    train classifier with X_train, Y_train
    Y_pred ← predict labels using classifier with X_test
End of algorithm

5.1 Data Preprocessing

The MASD approach uses code embedding to preprocess web request logs (Jemal et al., 2020). The Uniform Resource Identifier (URI) is the first input. It is a string that identifies a resource on the Internet. Removing the hostname from this string yields the path, the query, and the fragment of the URL (collectively referred to as the path in the rest of the paper for brevity). Code embedding represents the characters of the path as a vector of ASCII codes. As a result, characters can be grouped together or compared according to how similar they are. This method is regarded as useful in machine translation and language modeling (Jemal et al., 2020). Table 3 shows the conversion from paths to ASCII code values. After obtaining the list of ASCII codes for each path, the requests are grouped by session. In the Banking Dataset, Session ID values are used to build sessions, and each session includes at most one hundred requests. In the other datasets, sessions are built by selecting up to a hundred requests.
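The preprocessing steps described above can be sketched with the Python standard library. The helper names, the hostname, and the session IDs are hypothetical, and silently dropping requests beyond the hundredth is an assumption about how the session limit is enforced.

```python
from collections import defaultdict
from typing import Dict, List, Tuple
from urllib.parse import urlsplit

MAX_REQUESTS_PER_SESSION = 100  # limit stated in the text


def extract_path(uri: str) -> str:
    """Drop scheme and hostname; keep path, query, and fragment."""
    parts = urlsplit(uri)
    path = parts.path
    if parts.query:
        path += "?" + parts.query
    if parts.fragment:
        path += "#" + parts.fragment
    return path


def code_embedding(path: str) -> List[int]:
    """Represent each character of the path by its code point."""
    return [ord(ch) for ch in path]


def group_by_session(requests: List[Tuple[str, str]]) -> Dict[str, List[List[int]]]:
    """Group (session_id, uri) pairs into sessions of at most 100 requests."""
    sessions: Dict[str, List[List[int]]] = defaultdict(list)
    for session_id, uri in requests:
        # Assumption: requests beyond the limit are dropped.
        if len(sessions[session_id]) < MAX_REQUESTS_PER_SESSION:
            sessions[session_id].append(code_embedding(extract_path(uri)))
    return dict(sessions)


# Hypothetical hostname and session IDs, used only for illustration.
logs = [
    ("s1", "https://bank.example/ngi/index.do"),
    ("s2", "https://bank.example/core/.env"),
]
sessions = group_by_session(logs)
print(sessions["s1"][0])
# [47, 110, 103, 105, 47, 105, 110, 100, 101, 120, 46, 100, 111]
```

The printed vector matches the Table 3 entry for /ngi/index.do. Note that `ord` returns Unicode code points, which coincide with ASCII codes only for ASCII characters.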

5.2 Evaluation Metrics

This paper uses the accuracy, recall, precision, and F1

score as evaluation metrics. The prediction is identi-

ﬁed as True Positive (TP) if a malicious session (Ac-

tual Positive) is labeled as a malicious session (Pre-

dicted Positive). True Negative (TN), False Positive

(FP), and False Negative (FN) represent other respec-

tive results in the same manner. The calculations of

evaluation metrics are shown with the following for-

mulas.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\mathrm{Recall\ (True\ Positive\ Rate)} = \frac{TP}{TP + FN} \tag{2}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{3}$$

$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} = \frac{2 \times TP}{2 \times TP + FP + FN} \tag{4}$$

Table 3: Embedding of Instances.

| Path | ASCII Codepoint Vector Representation |
|---|---|
| /ngi/index.do | [47, 110, 103, 105, 47, 105, 110, 100, 101, 120, 46, 100, 111] |
| /ows-bin/windmail.exe | [47, 111, 119, 115, 45, 98, 105, 110, 47, 119, 105, 110, 100, 109, 97, 105, 108, 46, 101, 120, 101] |
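The four evaluation metrics of Equations (1)-(4) translate directly into code. The sketch below is illustrative: the confusion-matrix counts are hypothetical, and returning 0.0 when a denominator is empty is a convention chosen here, not something the paper specifies.

```python
def evaluation_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute Equations (1)-(4) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against empty denominators when a class is absent or never predicted.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical counts for a 20:100 test set: 18 of 20 malicious sessions caught,
# 3 of 100 benign sessions flagged by mistake.
m = evaluation_metrics(tp=18, tn=97, fp=3, fn=2)
print({name: round(value, 4) for name, value in m.items()})
```

With these counts the two forms of Equation (4) agree: the harmonic mean of precision (18/21) and recall (18/20) equals 36/41.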

6 BENCHMARK MODELS

SVM, Naïve Bayes, CNN, Decision Tree, DBSCAN, and SOM are implemented as the benchmark models to evaluate the proposed approach. The experimental trials are carried out on the same datasets.

Support Vector Machines (SVMs) are a popular and powerful class of supervised learning algorithms used

for classiﬁcation and regression tasks. SVMs aim to

ﬁnd the optimal hyperplane that separates different

classes or predicts continuous values with the maxi-

mum margin. They are particularly effective in han-

dling complex and high-dimensional data by trans-

forming it into a higher-dimensional feature space us-

ing kernel functions. The key strength of SVMs lies in

their ability to handle non-linear relationships by im-

plicitly mapping the data into a higher-dimensional

space. By maximizing the margin between classes,

SVMs promote generalization and robustness against

overﬁtting. They can handle both binary and multi-

class classiﬁcation problems and provide continuous

predictions for regression tasks. However, SVMs can

be computationally expensive, require careful selec-

tion of hyperparameters, and their performance may

vary depending on the choice of the kernel function.

Nonetheless, SVMs remain widely used and valued

for their ﬂexibility, robustness, and theoretical foun-

dation in machine learning (Azab et al., 2022; Cortes

and Vapnik, 1995). The hyperparameters are defined as γ = 10, C = 0.25, and tol = $10^{-7}$.

Naïve Bayes (NB) is a simple yet effective probabilistic machine learning algorithm used extensively for classification. Rooted in Bayes' theorem, it assumes that features are conditionally independent given the class variable, a concept termed naïve independence. This assumption simplifies computation and enables efficient training and prediction. Because they learn quickly and scale well, Naïve Bayes models are advantageous for large datasets, and they perform robustly even with limited training data and high-dimensional feature spaces. Naïve Bayes classifiers are widely applied in text classification tasks such as spam filtering and sentiment analysis, where the independence assumption aligns well. Although the naïve assumption does not hold in all scenarios, Naïve Bayes classifiers often deliver competitive results and serve as a dependable and comprehensible baseline for classification (Azab et al., 2022; Lewis, 1998).

Convolutional Neural Networks (CNNs) are a

subset of deep learning models built to be exception-

ally good at processing and evaluating visual data

such as images and videos. CNNs have transformed

computer vision tasks and attained cutting-edge per-

formance in several ﬁelds. The key component of

CNNs is the convolutional layer, which applies a set

of learnable ﬁlters to the input data, capturing local

patterns and spatial dependencies. This allows CNNs

to automatically learn hierarchical representations of

the input, starting from low-level features and grad-

ually building up to more complex and abstract fea-

tures. The pooling layers further downsample the

features, reducing the spatial dimensionality and ex-

tracting the most salient information. The stacked

convolutional and pooling layers create a powerful

feature extractor that is robust to variations in posi-

tion, rotation, and scale. CNNs are typically followed

by fully connected layers that perform classiﬁcation

or regression tasks (Azab et al., 2022). The CNN model used in the experimental trials comprises two convolution layers, a batch normalization layer, a max pooling layer, and a fully connected layer. The number of epochs is set to 40 and the batch size to 20. Additionally, the Adam optimizer is used with the following parameters: learningRate = 0.0005, beta_1 = 0.9, beta_2 = 0.999, and amsgrad = False.

Decision Tree is a supervised learning technique.

It aids in decision-making by breaking down com-

plex decisions into a series of simpler ones. The

algorithm divides data into subsets depending on

several features, much like the branches of a tree,

providing a clear path to solutions. Classification trees are tree models in which the target variable takes a discrete set of values. The leaves of these tree structures represent class labels, and the branches represent the attribute tests that lead to them (Goseva-Popstojanova et al., 2012). Default hyperparameters are used (criterion = “gini”, min_samplesSplit = 2, min_samplesLeaf = 1, and min_weightFractionLeaf = 0.0).

DBSCAN (Density-Based Spatial Clustering of

Applications with Noise) is an unsupervised tech-

nique that groups the data points according to their

proximity. Researchers in data mining and machine

learning frequently use the DBSCAN approach. DB-

SCAN performs well when all clusters are sufﬁciently

dense and well-separated by low-density regions (Sun

et al., 2020). The hyperparameters are set to ε = 0.19 and min_samples = 2.

Self-Organizing Map (SOM) is a popular unsuper-

vised machine learning (ML) technique in the data

exploration and visualization area. It creates a low-

dimensional representation of high-dimensional data

by matching a grid of nodes to the input information

over a predefined number of iterations. Typically, the neurons (also called nodes or reference vectors) are arranged in a single rectangular or hexagonal two-dimensional grid (Sadeghpour et al., 2021). The hyperparameters are num_rows = 10, num_cols = 10, max_mdistance = 4, max_learningRate = 0.5, and max_steps = 750.

Each benchmark model is placed as the classifier in the MASD architecture, and the experimental results are obtained for it in turn. Validation and training accuracy values are monitored to avoid overfitting. Overfitting occurs when the training accuracy continues to increase while the validation accuracy plateaus or diminishes. Stable training and validation accuracy values have been achieved by tuning hyperparameters. The descriptions of the hyperparameters are given in Table 6.
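The overfitting criterion described above (training accuracy still rising while validation accuracy plateaus or falls) can be checked mechanically. The sketch below is one possible formulation, not the procedure the authors used: the function name, the `patience` parameter, and the learning curves are hypothetical.

```python
from typing import List, Optional


def overfitting_epoch(train_acc: List[float], val_acc: List[float],
                      patience: int = 3) -> Optional[int]:
    """Return the epoch index at which overfitting is flagged, or None.

    Counts epochs in which training accuracy rose while validation accuracy
    failed to beat its best value so far; the counter resets whenever
    validation accuracy improves. Both lists must be non-empty.
    """
    best_val = val_acc[0]
    stalled = 0
    for epoch in range(1, min(len(train_acc), len(val_acc))):
        train_rising = train_acc[epoch] > train_acc[epoch - 1]
        if val_acc[epoch] > best_val:
            best_val = val_acc[epoch]
            stalled = 0
        elif train_rising:
            stalled += 1
        if stalled >= patience:
            return epoch
    return None


# Hypothetical learning curves, not taken from the experiments.
train = [0.80, 0.85, 0.90, 0.93, 0.95, 0.97, 0.98]
val = [0.78, 0.82, 0.84, 0.84, 0.83, 0.83, 0.82]
print(overfitting_epoch(train, val))  # 5
```

A flagged epoch would suggest stopping training earlier or adjusting hyperparameters until the two curves stay close.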

7 RESULTS

The experimental trials are completed on an ordinary computer. Python 3.8 and TensorFlow 2.10 are used to implement all models.

7.1 Evaluation with Datasets

For evaluation in the banking domain, the sample session sets are arranged in proportions of 20 malicious sessions to 100 benign sessions, so that all models are assessed on the same class ratio. For all datasets, each session contains between one and a hundred requests. The proposed approach has achieved a better accuracy (0.9583) than the benchmark models over the banking dataset. The SOM algorithm has obtained the second-best accuracy (0.9167). Decision Tree has achieved an accuracy of 0.9083. The accuracy of CNN and SVM is 0.8333. NB has a slightly better accuracy (0.8417) than CNN and SVM. Finally, the DBSCAN algorithm has the lowest accuracy of 0.7000.

Python Software Foundation. Available online: https://www.python.org/
TensorFlow Library. Available online: https://www.tensorflow.org

The sample session sets for the WAF and CSIC 2010 datasets are generated from malicious and benign queries. After compiling malicious and benign sessions, the sample sets are prepared with 20:100 proportions (20 malicious sessions and 100 benign sessions). The proposed approach has outperformed the benchmark models over the CSIC 2010 data. The suggested technique has an accuracy of 0.9667, while CNN and SVM have an accuracy of 0.8333. NB and DBSCAN have an accuracy of 0.7833 and 0.7417, respectively. The SOM algorithm has an accuracy of 0.9417. Over the WAF dataset, the suggested technique has the highest accuracy (0.9917). The accuracy values of SVM, CNN, and NB are 0.8333, 0.8417, and 0.6583, respectively. DBSCAN has an accuracy of 0.7500. SOM and Decision Tree are very promising, with accuracy rates of 0.9833 and 0.9750, respectively. These results are presented in Table 4.

7.2 Summary

Table 4 shows that the proposed model outperforms the competing methods on all examined datasets. Self-Organizing Maps (SOM), the technique with the second-highest accuracy, falls short of the proposed model by roughly 1 to 4 percentage points. The Decision Tree method, ranking third in accuracy, lags behind by approximately 2 to 7.5 percentage points across the datasets. The Support Vector Machine (SVM) maintains the same accuracy (0.8333) on all datasets, which is 12.5 to 16 percentage points lower than that of the proposed model. The Naïve Bayes (NB) approach varies the most, with an accuracy roughly 12 to 33 percentage points lower than that of the proposed model. Likewise, the Convolutional Neural Network (CNN) yields accuracies 12.5 to 15 percentage points below those of the proposed model. Finally, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) shows an accuracy shortfall of roughly 22 to 26 percentage points. Together, these findings indicate that the proposed model delivers superior results across diverse datasets.
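The gaps discussed above can be recomputed directly from the accuracy rows of Table 4. The sketch below is only a bookkeeping aid: the dictionary holds the accuracy values copied from the table, while the helper name `gap_range` is illustrative.

```python
# Accuracy values copied from Table 4, in the order (Banking, CSIC2010, WAF).
accuracy = {
    "MASD with RF": (0.9583, 0.9667, 0.9917),
    "SVM": (0.8333, 0.8333, 0.8333),
    "Naive Bayes": (0.8417, 0.7833, 0.6583),
    "CNN": (0.8333, 0.8333, 0.8417),
    "Decision Tree": (0.9083, 0.8917, 0.9750),
    "DBSCAN": (0.7000, 0.7417, 0.7500),
    "SOM": (0.9167, 0.9417, 0.9833),
}


def gap_range(method, reference="MASD with RF"):
    """Smallest and largest accuracy gap to the reference, in percentage points."""
    gaps = [
        round((ref - acc) * 100, 2)
        for ref, acc in zip(accuracy[reference], accuracy[method])
    ]
    return min(gaps), max(gaps)


for method in accuracy:
    if method != "MASD with RF":
        low, high = gap_range(method)
        print(f"{method:13s} {low:5.2f} to {high:5.2f} percentage points")
```

Expressing the differences in percentage points rather than in percent avoids ambiguity between absolute and relative differences.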

Table 4: Results of the Experiments (Top two results of each metric are bolded, precedence of MASD is certain).

| Method | Metric | Banking | CSIC2010 | WAF |
|---|---|---|---|---|
| MASD with RF (Proposed) | Accuracy | 0.9583 | 0.9667 | 0.9917 |
| | Recall | 0.9583 | 0.9667 | 0.9917 |
| | Precision | 0.9594 | 0.9662 | 0.9917 |
| | F1 Score | 0.9587 | 0.9660 | 0.9916 |
| SVM | Accuracy | 0.8333 | 0.8333 | 0.8333 |
| | Recall | 0.8333 | 0.8333 | 0.8333 |
| | Precision | 0.6944 | 0.6944 | 0.6944 |
| | F1 Score | 0.7576 | 0.7576 | 0.7576 |
| Naïve Bayes | Accuracy | 0.8417 | 0.7833 | 0.6583 |
| | Recall | 0.8417 | 0.7833 | 0.6583 |
| | Precision | 0.9188 | 0.8933 | 0.8880 |
| | F1 Score | 0.8589 | 0.8092 | 0.7008 |
| CNN | Accuracy | 0.8333 | 0.8333 | 0.8417 |
| | Recall | 0.8333 | 0.8383 | 0.8417 |
| | Precision | 0.6944 | 0.6944 | 0.8669 |
| | F1 Score | 0.7576 | 0.7576 | 0.7769 |
| Decision Tree | Accuracy | 0.9083 | 0.8917 | 0.9750 |
| | Recall | 0.9083 | 0.8917 | 0.9750 |
| | Precision | 0.9409 | 0.9343 | 0.9783 |
| | F1 Score | 0.9156 | 0.9012 | 0.9757 |
| DBSCAN | Accuracy | 0.7000 | 0.7417 | 0.7500 |
| | Recall | 0.7000 | 0.7417 | 0.7500 |
| | Precision | 0.7901 | 0.8521 | 0.8241 |
| | F1 Score | 0.7317 | 0.7720 | 0.7748 |
| SOM | Accuracy | 0.9167 | 0.9417 | 0.9833 |
| | Recall | 0.9167 | 0.9417 | 0.9833 |
| | Precision | 0.9133 | 0.9455 | 0.9837 |
| | F1 Score | 0.9105 | 0.9365 | 0.9830 |

8 CONCLUSIONS

This paper presents a machine learning-based solution for detecting malicious web sessions. The study introduces a classifier-driven methodology that uses machine learning algorithms to identify and classify malicious web sessions. The approach is evaluated on three distinct datasets, which underscores its versatility. Notably, it achieves high accuracy without the restrictive feature extraction phase that typically yields only a limited set of features. On the WAF dataset, the recorded results surpass 99% on all assessed metrics; on the banking and CSIC 2010 datasets, they exceed 95%. These outcomes highlight the potential of the machine learning-based classifier approach for strengthening web session security. Furthermore, this research lays the groundwork for future extensions, such as the classification of diverse user sessions and a deeper exploration of user behavior patterns.

REFERENCES

Ahmad, F. (2017). WAF malicious queries data sets. [Online]. Available: https://web.archive.org/web/20230301151428/https://github.com/faizann24/Fwaf-Machine-Learning-driven-Web-Application-Firewall/. (Accessed on 01 March 2023).

Azab, A., Khasawneh, M., Alrabaee, S., Choo, K.-K. R., and Sarsour, M. (2022). Network traffic classification: Techniques, datasets, and challenges. Digital Communications and Networks.

Cortes, C. and Vapnik, V. (1995). Support-vector networks. Machine learning, 20:273–297.

Giménez, C. T., Villegas, A. P., and Marañón, G. A. (2012). Information Security Institute, HTTP DATASET CSIC 2010. [Online]. Available: https://www.tic.itefi.csic.es/dataset/. (Accessed on 18 December 2014).

Goseva-Popstojanova, K., Anastasovski, G., and Pantev, R. (2012). Classification of malicious Web sessions. In 2012 21st International Conference on Computer Communications and Networks (ICCCN), pages 1–9. IEEE.

Gutiérrez, M. G.-C., Pongilupi, J. V., and LLinás, M. M. (2010). Web sessions anomaly detection in dynamic environments. In ISSE 2009 Securing Electronic Business Processes: Highlights of the Information Security Solutions Europe 2009 Conference, pages 216–220. Springer.

Jemal, I., Haddar, M. A., Cheikhrouhou, O., and Mahfoudhi, A. (2020). Malicious HTTP request detection using code-level convolutional neural network. In International Conference on Risks and Security of Internet and Systems, pages 317–324. Springer.

Lewis, D. D. (1998). Naive (Bayes) at forty: The independence assumption in information retrieval. In Machine Learning: ECML-98: 10th European Conference on Machine Learning Chemnitz, Germany, April 21–23, 1998 Proceedings 10, pages 4–15. Springer.

Mansfield-Devine, S. (2022). Verizon: Data Breach Investigations Report.

PurpleSec (2022). Cyber Security Statistics: The Ultimate List Of Stats Data, & Trends For 2022. [Online]. Available: https://web.archive.org/web/20221205155455/https://purplesec.us/resources/cyber-security-statistics/. (Accessed on 5 December 2022).

Sadeghpour, S., Vlajic, N., Madani, P., and Stevanovic, D. (2021). Unsupervised ML Based Detection of Malicious Web Sessions with Automated Feature Selection: Design and Real-World Validation. In 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), pages 1–9. IEEE.

Stevanovic, D., Vlajic, N., and An, A. (2011). Unsupervised clustering of Web sessions to detect malicious and non-malicious website users. Procedia Computer Science, 5:123–131.

Suchacka, G. and Iwański, J. (2020). Identifying legitimate Web users and bots with different traffic profiles—an Information Bottleneck approach. Knowledge-Based Systems, 197:105875.

Sun, Y., Xie, Y., Wang, W., Zhang, S., Gao, J., and Chen, Y. (2020). WSAD: An Unsupervised Web Session Anomaly Detection Method. In 2020 16th International Conference on Mobility, Sensing and Networking (MSN), pages 735–739. IEEE.

Toprak, S. and Yavuz, A. (2022). Web Application Firewall Based on Anomaly Detection using Deep Learning. In ACTA INFOLOGICA 2022, pages 142–149.

APPENDIX

Table 5 lists the abbreviations used in this paper, and Table 6 describes the hyperparameters mentioned in this paper.

Table 5: ABBREVIATIONS.

| Abbreviation | Definition |
|---|---|
| B | Benign |
| CNN | Convolutional Neural Network |
| FN | False Negative |
| FP | False Positive |
| HTTP | Hypertext Transfer Protocol |
| M | Malicious |
| ML | Machine Learning |
| RF | Random Forest |
| SL | Supervised Learning |
| SOM | Self-Organizing Map |
| SVM | Support Vector Machine |
| TN | True Negative |
| TP | True Positive |
| TPR | True Positive Rate |
| UL | Unsupervised Learning |
| URI | Uniform Resource Identifier |
| WAF | Web Application Firewall |
| WSAD | Web Session Anomaly Detection |
| YKT | Yapı Kredi Technology |

Table 6: Descriptions of Hyperparameters.

| Method | Hyperparameter | Description |
|---|---|---|
| RF | n_estimators | The number of trees in the forest |
| | random_state | The randomization parameter for the bootstrapping of the data points used in creating trees and for the sampling of features considered when searching for the best split at each node |
| SVM | γ | The coefficient of the kernel |
| | C | The regularization parameter |
| | tol | The tolerance for the stopping criterion |
| CNN | learningRate | The learning rate |
| | beta_1 | Exponential decay rate of first-moment estimates |
| | beta_2 | Exponential decay rate of second-moment estimates |
| | amsgrad | Whether or not to use the AMSGrad variant |
| Decision Tree | criterion | The function for measuring the quality of a split |
| | min_samplesSplit | The minimum number of samples required to split an internal node |
| | min_samplesLeaf | The minimum number of samples required at a leaf node |
| | min_weightFractionLeaf | The minimum weighted fraction of the total weights required at a leaf node |
| DBSCAN | ε | The maximum distance between two instances for one to be considered in the neighborhood of the other |
| | min_samples | The number of instances that must exist in the neighborhood of a point for it to be considered a core point |
| SOM | num_rows | The number of rows in the grid |
| | num_cols | The number of columns in the grid |
| | max_mdistance | The maximum Manhattan distance between two grid points |
| | max_learningRate | The maximum learning rate |
| | max_steps | The maximum number of training steps |
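To make the two DBSCAN entries of Table 6 concrete, the following minimal sketch checks whether a point is a core point. It assumes the usual reading in which ε is the largest distance at which two instances still count as neighbors, and it assumes that a point counts itself among its neighbors (the convention followed by scikit-learn). The function name `is_core_point` and the toy coordinates are hypothetical.

```python
from math import dist


def is_core_point(index, points, eps, min_samples):
    """True if at least min_samples points (the point itself included) lie within eps."""
    center = points[index]
    neighbors = sum(1 for p in points if dist(center, p) <= eps)
    return neighbors >= min_samples


# Toy data: three nearby points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
flags = [is_core_point(i, points, eps=0.5, min_samples=3) for i in range(len(points))]
print(flags)
# prints: [True, True, True, False]
```

A larger ε or a smaller min_samples turns more instances into core points, so the two hyperparameters have to be tuned together.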
