both supervised and unsupervised learning.
Experiments have been conducted using a valid
data-set containing over two million connections to
build a model with multiple features (e.g., duration,
protocol, ports, packets, flows, etc.), in order to iden-
tify the approach that most accurately detects traffic
anomalies in the target network. The performance of
the proposed approach has been evaluated for each of
the algorithms. Preliminary results are promising, with
accuracies of 99% and 94% for the supervised and
unsupervised learning approaches, respectively
(considering the best I-LADS version).
The remainder of this paper is structured as fol-
lows: Section 2 discusses related work. Section 3
presents I-LADS and details its architecture, methodology,
algorithms and performance. Section 4
details the experiments performed to train, test and select
the most accurate anomaly detection model, and
shows preliminary results and performance metrics associated
with each algorithm used. Finally, conclusions
and perspectives for future work are presented in Section 5.
2 RELATED WORK
Network Flow Anomaly Detection has been widely
studied in the literature. Brauckhoff (Brauckhoff,
2010), for instance, uses histogram-based anomaly
detectors to pre-filter a set of suspicious flows and applies
association rule mining to extract the flows that
have caused a malicious event. The model monitors
one of the following attributes: transport protocol,
source IP address, destination IP address, source port
number, destination port number, packets per flow,
bytes per flow, inter-arrival times, flow duration, and
TCP flags. Labels that identify when an anomaly
has happened are required for evaluating whether an
anomaly detection system is accurate or not.
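The histogram-based idea can be illustrated with a minimal sketch. It assumes a single monitored attribute (destination port), a fixed set of bins, and the Kullback-Leibler divergence as the distance between the current window and a baseline; the bins, the threshold value and the function names are hypothetical choices made for illustration and are not taken from the cited work.

```python
import math
from collections import Counter


def histogram(values, bins):
    """Relative frequency of each bin, with add-one smoothing to avoid zeros."""
    counts = Counter(v for v in values if v in bins)
    total = sum(counts.values()) + len(bins)
    return [(counts.get(b, 0) + 1) / total for b in bins]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def is_anomalous(baseline, window, bins, threshold=0.5):
    """Flag the window if its histogram diverges from the baseline histogram.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return kl_divergence(histogram(window, bins), histogram(baseline, bins)) > threshold


# Hypothetical destination-port observations.
bins = [22, 53, 80, 443]
baseline = [80] * 50 + [443] * 40 + [53] * 10
normal_window = [80] * 24 + [443] * 21 + [53] * 5
scan_window = [22] * 45 + [80] * 3 + [443] * 2

print(is_anomalous(baseline, normal_window, bins))
print(is_anomalous(baseline, scan_window, bins))
```

In a pre-filtering setting, only the flows falling into the bins that contribute most to the divergence would be passed on to the rule mining step.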
The approach of (Attak et al., 2018) builds upon
this idea and proposes a SHIELD architecture and
different ML algorithms for the detection of anomalies
using the NetFlow traffic protocol. The authors compare
the performance of the different algorithms: one-class
SVM, Isolation Forest, Local Outlier Factor and DL
autoencoders. The autoencoders developed using DL
techniques show very high potential, although
fine-tuning is necessary to obtain stable results, reaching an
accuracy score of 87%.
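As an illustration of how three of the detectors named above can be compared on the same data, the following sketch uses their scikit-learn implementations. It assumes scikit-learn and NumPy are available, and it uses synthetic two-dimensional data rather than NetFlow records; the hyperparameters are arbitrary and do not reproduce the configuration of the cited work. The DL autoencoder is omitted for brevity.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Synthetic "normal" training data (assumption: stands in for flow features).
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(300, 2))

# One typical point and one point far from the training distribution.
X_test = np.array([[0.0, 0.0], [10.0, 10.0]])

detectors = {
    "one-class SVM": OneClassSVM(nu=0.05, gamma="scale"),
    "Isolation Forest": IsolationForest(random_state=0),
    # novelty=True allows predict() to be called on unseen data.
    "Local Outlier Factor": LocalOutlierFactor(n_neighbors=20, novelty=True),
}

predictions = {}
for name, detector in detectors.items():
    detector.fit(X_train)
    # predict() returns +1 for inliers and -1 for outliers.
    predictions[name] = detector.predict(X_test)
    print(name, predictions[name])
```

All three estimators share the same fit/predict interface, which makes it straightforward to evaluate them against the same labeled test set.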
In addition, (Lopez-Martin et al., 2017) propose
an IoT network-based intrusion detection method that
is based on a conditional variational autoencoder that
integrates the intrusion labels inside the decoder lay-
ers. The method is able to recover missing features
from incomplete training datasets and provides a very
high accuracy in the reconstruction process, even for
categorical features with a high number of distinct
values. As a result, the proposed method is less
complex and provides better classification results than
other familiar classifiers.
The work of (Mirsky et al., 2018) presents an
unsupervised plug-and-play NIDS that can learn to
detect attacks on the local network. The approach
uses autoencoders as an ensemble of neural networks
to collectively differentiate between normal and ab-
normal traffic patterns. It supports feature extraction
that tracks patterns of every network channel. Results
show the possibility of detecting various attacks with
a performance comparable to offline anomaly detec-
tors.
Furthermore, (Nguyen et al., 2019) present a
framework for detecting and explaining anomalies
in network traffic based on a Variational Autoencoder,
together with a gradient fingerprinting technique for explaining
anomalies. Results show that the approach is effective
in detecting different anomalies as well as identifying
fingerprints that are good representations of these various
attacks.
More recently, (Delplace et al., 2020) propose using
ML and DL models for the detection of botnets
using NetFlow datasets. In their work, the authors derive
new features from the NetFlow
datasets with the objective of improving the models.
According to the authors, the Random Forest (RF) classifier
has the best performance in 13 different scenarios,
with an accuracy of more than 95%.
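To make the notion of deriving new features from flow records concrete, the sketch below aggregates flows per source IP address. The record fields and the four derived features are hypothetical examples chosen for illustration; the text above does not list the features actually used in the cited work.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Simplified flow record (hypothetical field names)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    packets: int
    n_bytes: int


def aggregate_by_source(flows):
    """Derive per-source-IP features from a list of flow records."""
    groups = defaultdict(list)
    for flow in flows:
        groups[flow.src_ip].append(flow)

    features = {}
    for src_ip, group in groups.items():
        features[src_ip] = {
            "flow_count": len(group),
            "distinct_dst_ips": len({f.dst_ip for f in group}),
            "distinct_dst_ports": len({f.dst_port for f in group}),
            "mean_bytes_per_flow": sum(f.n_bytes for f in group) / len(group),
        }
    return features


# Illustrative records only.
flows = [
    Flow("10.0.0.5", "10.0.0.9", 80, 10, 1200),
    Flow("10.0.0.5", "10.0.0.9", 443, 8, 800),
    Flow("10.0.0.7", "10.0.0.1", 53, 1, 100),
]
features = aggregate_by_source(flows)
print(features["10.0.0.5"])
```

Aggregated features of this kind can then be fed to a classifier such as Random Forest alongside, or instead of, the raw per-flow attributes.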
In one of our previous works (Gonzalez-Granadillo
et al., 2019), we proposed an approach for the analysis
of the behavior of IoT devices that generate few signals.
This approach uses a one-class SVC algorithm
to detect network anomalies with features related to
IP addresses (IP source, IP destination, IP distance,
IP known and IP unknown). The approach proposed in this paper
improves on this previous work by adding multiple features
to the analysis and using other learning algorithms to
detect anomalies in the network traffic. Results show
significant improvements in the performance of the
tool.
3 IMPROVED LIVE ANOMALY
DETECTION SYSTEM (I-LADS)
I-LADS results from an adaptation of Convolutional
Autoencoders that, instead of processing images to
check for anomalies, processes NetFlow data to categorize
network traffic as normal or anomalous,
based on the modeling of multiple NetFlow param-