Table 4: Vector of observations OV_0. The same solution x, shown in Table 1, does not violate the constraint.
OV_0:           0 0 0 0 0 0 1 40 0 0 5 0 0 0 0 1 0 25 0 13 0 0 0 2 0 3 30 0
AE ∗ x − OV_0:  0 0 0 0 0 0 -6 -40 0 0 -5 0 0 0 0 -1 0 -25 0 -13 0 0 0 -2 0 0 -30 0
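The constraint check behind Table 4 can be sketched in a few lines. This is a minimal illustration, assuming the constraint is the component-wise condition AE ∗ x ≤ OV (so every entry of AE ∗ x − OV must be non-positive); the function name `violated_event_types` is hypothetical, and the residual values are copied from the table.

```python
# Residual row AE * x - OV_0, copied from Table 4.
residual = [0, 0, 0, 0, 0, 0, -6, -40, 0, 0, -5, 0, 0, 0, 0, -1,
            0, -25, 0, -13, 0, 0, 0, -2, 0, 0, -30, 0]


def violated_event_types(residual):
    """Return the indices where AE * x exceeds OV (residual > 0).

    Assumption: the constraint is AE * x <= OV, checked per event type.
    """
    return [i for i, r in enumerate(residual) if r > 0]


violations = violated_event_types(residual)
print(violations)  # an empty list means no event type violates the constraint
```

An empty result corresponds to the caption's statement that the solution does not violate the constraint.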
Table 5: Second phase. A subset of intrusions S_2 that violates the constraint with the subset S_1 found by the iterative process.
S_1:  21 48 69 96 117 144 165 192 213 240 261 288 309 336 357 384 405 432 453 480 501 549
S_2:  528 576 597 624 645 672 693 720 741 768 789 816 837 864 885 912 933 960 981 - - -
plexity is higher by O(sg/h).
The space complexity for the NN can be considered O(nm), because it needs to store the AE matrix and the OV and x vectors. The GA, besides these structures, needs to store the population, which is of order O(sl). So the GA space complexity is higher than the NN space complexity by O(sl).
5 CONCLUSIONS AND FUTURE WORK
Two paradigms were tested on the misuse detection problem in audit trail files. As some intrusions share the same types of events, the possible solution x is such that some x_i are dependent, which makes the genetic algorithm paradigm more suited for solving this problem. However, the quality of the solution obtained with the GA comes at a higher computational complexity cost of O(sg/h) (population size times the ratio of the number of generations to the number of NN iterations) and a higher space complexity cost of O(sl) (population size times the length of x) with respect to the NN.
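To make the two overhead terms concrete, the following sketch evaluates them for a given parameter setting, ignoring constant factors. The function name `ga_overhead` and the numeric values are hypothetical and are not taken from the experiments reported here.

```python
def ga_overhead(s, g, h, l):
    """Return the GA's extra cost terms relative to the NN, up to constants.

    s: population size, g: number of GA generations,
    h: number of NN iterations, l: length of the solution vector x.
    """
    time_factor = s * g / h  # O(sg/h) term: extra computational cost
    extra_space = s * l      # O(sl) term: s individuals of length l
    return time_factor, extra_space


# Hypothetical parameter values, chosen only for illustration.
time_factor, extra_space = ga_overhead(s=1000, g=200, h=400, l=48)
print(time_factor, extra_space)
```

Under these assumed values, the time term grows with both the population size and the number of generations, while the space term depends only on the population size and the length of x.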
The GA has the advantage of discriminating an intrusion from a non-intrusion, as the solution of the problem is encoded as 1 (intrusion) and 0 (non-intrusion). As the range of values of x_i for the NN is such that x_i ≥ 0, the values of intrusions are input dependent, that is, they depend on the observed vector OV. However, at least for this test set, non-intrusions are variables x_i that converge to 0 or to values ≈ 0 when the initial condition is x = 0.
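The difference between the two encodings can be sketched as follows. The GA solution is read directly as a bit vector, whereas the non-negative NN solution needs a cutoff to separate values that converged to approximately zero from the rest. The tolerance `tol`, the function names, and the sample vectors are assumptions made for illustration only.

```python
def ga_intrusions(bits):
    """Indices flagged by a binary GA solution (1 = intrusion, 0 = non-intrusion)."""
    return [i for i, b in enumerate(bits) if b == 1]


def nn_intrusions(x, tol=1e-6):
    """Indices flagged by a non-negative NN solution.

    Assumption: entries at or below `tol` are treated as converged to zero,
    that is, as non-intrusions. The tolerance is illustrative.
    """
    return [i for i, xi in enumerate(x) if xi > tol]


# Hypothetical solutions of length 5.
ga_solution = [0, 1, 0, 0, 1]
nn_solution = [0.0, 2.7, 1e-9, 0.0, 13.2]

ga_flagged = ga_intrusions(ga_solution)
nn_flagged = nn_intrusions(nn_solution)
print(ga_flagged, nn_flagged)
```

The magnitudes in the NN vector carry no fixed meaning here; only whether an entry is distinguishable from zero is used.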
For the test set defined in this paper, there were no false positives, except if we consider the NN without the second phase (see Section 4) or if the initial conditions change (see Section 2). On the false negative side, if we look at the two sets S_1 and S_2 (see Section 3), the GA has on average (over 30 runs) 39.14% false negatives, and the NN has 60.95%. However, the set S_2 can have exclusive intrusions, so the process can continue until we get a set of mutually exclusive subsets whose union is S (Diaz-Gomez and Hougen, 2007).
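For reference, a false negative percentage of the kind reported above can be computed as the share of true intrusions that a method fails to report. The sketch below uses small hypothetical index sets, not the paper's data, and the function name is an assumption.

```python
def false_negative_rate(true_intrusions, detected):
    """Fraction of true intrusions that were not reported by the detector."""
    true_set = set(true_intrusions)
    missed = true_set - set(detected)
    return len(missed) / len(true_set)


# Hypothetical example: five true intrusions, three of them detected.
true_intrusions = [1, 2, 3, 4, 5]
detected = [1, 2, 4]

rate = false_negative_rate(true_intrusions, detected)
print(f"{100 * rate:.2f}% false negatives")
```

Averaging this quantity over repeated runs would give a figure comparable in form to the 30-run average quoted for the GA.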
To improve the false negative ratio of the GA, increasing the population size (s > 1,000) may decrease the ratio; however, the number of generations g may also need to be considered, independently or in conjunction with the population size. For the NN, diminishing the false negative ratio is a more challenging problem: after the convergence of all x_i's, a higher number of iterations h brings no improvement in the solution x.
REFERENCES
Diaz-Gomez, P. A. and Hougen, D. F. (2005a). Analysis and mathematical justification of a fitness function used in an intrusion detection system. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 1591–1592.
Diaz-Gomez, P. A. and Hougen, D. F. (2005b). Improved off-line intrusion detection using a genetic algorithm. In Proceedings of the 7th International Conference on Enterprise Information Systems, pages 66–73.
Diaz-Gomez, P. A. and Hougen, D. F. (2006). A genetic algorithm approach for doing misuse detection in audit trail files. In Proceedings of the CIC-2006 International Conference on Computing, pages 329–335.
Diaz-Gomez, P. A. and Hougen, D. F. (2007). Misuse detection: An iterative process vs. a genetic algorithm approach. In Proceedings of the 9th International Conference on Enterprise Information Systems.
Ham, F. M. and Kostanic, I. (2001). Principles of Neurocomputing for Science & Engineering. McGraw-Hill.
Mé, L. (1993). Security audit trail analysis using genetic algorithms. In Proceedings of the 12th International Conference on Computer Safety, Reliability, and Security, pages 329–340.
Mé, L. (1998). GASSATA, a genetic algorithm as an alternative tool for security audit trail analysis. In Proceedings of the First International Workshop on the Recent Advances in Intrusion Detection.
ICEIS 2007 - International Conference on Enterprise Information Systems