representative of an organization's decision process
when choosing an IDS. It is also worth mentioning
that our approach is intended to comparatively assess
two or more intrusion detection systems, not to
measure the individual capabilities of a single
system. The paper is organized as follows: Section 2
presents related work; Section 3 describes the script
proposed to conduct the assessment; Section 4
presents the results obtained with the Snort and
Firestorm IDSs; and Section 5 concludes the paper
with final remarks and perspectives for future work.
2 RELATED WORK
This section presents the main approaches developed
to date to assess intrusion detection systems. The
criteria applied in this comparison were: type of
assessment, nature of the background traffic
generated for the experiments, and test-setting
requirements.
Regarding the type of assessment, most approaches
evaluate only the IDSs' signature bases (Puketza et
al., 1997; Lippmann et al., 2000; Barber, 2001).
Besides being exhaustive work, given the size of
these bases, this yields results that remain valid
only for a short period, because signatures are
updated quite frequently by IDS vendors and even by
users. Therefore, for the results of these approaches
to remain reliable, all the experiments must be
re-run every time a new signature is released.
On the other hand, approaches such as the one
proposed in this paper and in (Alessandri, 2000),
instead of testing the signature base, test the
IDS's detection capabilities. With these approaches,
experiments must be re-run only when new detection
functionalities are added to the system itself.
The representation of background traffic is a
fundamental feature in the assessment of IDSs,
because it directly affects the results of some
tests, such as those on false positive rates and
scalability. Approaches such as (Lippmann et al.,
1998) and (Lippmann et al., 1999) do not describe
how the background traffic is composed. This makes
the results questionable, especially in the
assessment of false positives, because there is no
way to ensure that no attacks are embedded in this
traffic, nor to identify the reasons that might have
led the IDSs to generate such results.
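The false-positive concern can be made concrete: a false positive rate is computed against ground truth, so any attack hidden in supposedly clean background traffic silently corrupts the count. A minimal sketch (the function and the labels are illustrative assumptions, not from any of the cited methodologies):

```python
def false_positive_rate(alerts, known_attacks):
    """Fraction of alerts that do not match any attack in the ground truth."""
    false_positives = [a for a in alerts if a not in known_attacks]
    return len(false_positives) / len(alerts) if alerts else 0.0

# If the background traffic secretly contains an attack ("hidden-attack"),
# a correct alert on it is wrongly counted as a false positive:
alerts = {"teardrop", "hidden-attack"}
print(false_positive_rate(alerts, {"teardrop"}))  # 0.5, overstating the FP rate
```

This is why an assessment that does not document the composition of its background traffic cannot support trustworthy false-positive figures.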
Except for the methodology proposed by
(Alessandri, 2000), all the others require some sort
of test setting to run the experiments. Approaches
such as (Puketza et al., 1997) and (Lippmann et al.,
1999) require complex test settings, with dozens of
stations (attackers, victims, evaluated systems,
traffic generators and traffic collectors), various
pieces of interconnection equipment (hubs, switches
and routers), and even firewalls. Such requirements
often make these approaches impracticable, since
they demand an extended time period and a dedicated
environment until the tests are completed.
Furthermore, the use of firewalls prevents several
attacks from being captured by the IDSs, since the
attacks are blocked before reaching the company's
internal network. Firewalls are crucial for any
business and must be part of every study that aims
at assessing a security infrastructure; for this
specific purpose, however, they limit the assessment
process.
As a general rule, the approaches cited lack a
proposal that could be applied by an organization's
security staff. To accomplish that, it is necessary
to develop an approach with well-defined procedures
that are easy to carry out and that fully reflect
the reality of the criteria assessed. The approaches
referenced here fail in these aspects: in addition
to not providing adequate documentation on how some
important tests were conducted, some of these
proposals have not yet been properly validated, or
do not offer the necessary means to be applied.
3 EVALUATION SCRIPT
The script is composed of five steps: selection of
attacks, selection of tools, generation of traffic
settings for evaluation, assembly of the evaluation
environment, and analysis of the IDSs.
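The five steps can be sketched as a pipeline in which each step consumes the previous step's output. The following skeleton is only illustrative: every function name and data value below is an assumption, not part of the script itself, and the analysis step is a placeholder for an actual run:

```python
def select_attacks():
    # Step 1: attacks exercising distinct detection mechanisms
    return ["url-encoding", "teardrop"]

def select_tools(attacks):
    # Step 2: one launch tool per selected attack (hypothetical names)
    return {a: f"tool-for-{a}" for a in attacks}

def generate_traffic(tools):
    # Step 3: traffic settings mixing each attack into background traffic
    return [{"attack": a, "tool": t} for a, t in tools.items()]

def assemble_environment(ids_under_test):
    # Step 4: evaluation environment holding the IDSs to be compared
    return {"ids": ids_under_test}

def analyze(env, traffic):
    # Step 5 (placeholder): a real run would replay the traffic and
    # record which attacks each IDS actually detected
    return {ids: {s["attack"]: None for s in traffic} for ids in env["ids"]}

env = assemble_environment(["Snort", "Firestorm"])
results = analyze(env, generate_traffic(select_tools(select_attacks())))
```

The point of the structure is that each step is independently repeatable: a new attack added in step 1 propagates through the remaining steps without redesigning the experiment.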
3.1 Selection of Attacks
The goal is to select a set of attacks with distinct
technical characteristics. Instead of simply
gathering a set of attacks, we propose to select, by
the end of this phase, attacks whose detection
exercises different mechanisms in an IDS. For
instance, for an IDS to be capable of detecting an
insertion attack such as URL Encoding, it needs more
than the capability of analyzing an HTTP packet: a
mechanism to decode the content of the packet header
is also necessary. Similarly, to detect
denial-of-service attacks such as Teardrop, a
mechanism capable of reassembling fragmented IP
packets is required. Thus, the attacks selected in
this step present a unique set of characteristics
that allows the evaluation of different