execution time. The most elusive attack class to classify
was infiltration. One reason for this might be the severe
lack of positive training samples for this category, a
natural consequence of the low network footprint of
infiltration attacks. Only k-nearest neighbors
(knn) came close to a decent precision score for this
class, but its recall fell short. Despite its simplicity,
knn is a potent classifier for five of the seven attack
classes. It is held back by a steep increase in execution
time on larger datasets, even though it generalizes as
well as the tree-based meta-estimators. By contrast,
the nearest centroid classifier is
nearly insensitive to dataset size while remaining applicable to
the classification of port scan and DDoS traffic and to
perfect detection of brute force and DoS traffic. The
final algorithms, two support vector machines with
different kernels and logistic regression, are useful for
recognition of port scan, DoS and DDoS traffic,
provided the features are scaled (preferably normalized).
However, these classifiers are not favoured when pitted
against the tree-based classifiers, because the attack
classes on which they perform well are only a subset of
the classes on which the tree-based classifiers perform
equally well.
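The scaling requirement for the support vector machines and logistic
regression can be illustrated with a minimal sketch. It is not the
experimental code behind the reported results: the data is synthetic
(generated with make_classification, not CICIDS2017), the hyperparameters
are scikit-learn defaults, and MinMaxScaler is assumed here as one possible
reading of "normalized". The names models and results are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for flow features (assumption: NOT the CICIDS2017 data).
# Class 1 plays the role of a minority attack class.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
# Spread the feature columns over several orders of magnitude,
# as raw flow statistics (byte counts, durations) often are.
X = X * np.logspace(0, 6, X.shape[1])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Two SVMs with different kernels and logistic regression (default settings).
models = {
    "svm_rbf": SVC(kernel="rbf"),
    "svm_linear": SVC(kernel="linear"),
    "logreg": LogisticRegression(max_iter=1000),
}

results = {}
for name, model in models.items():
    # The scaler sits inside the pipeline, so it is fitted on the
    # training split only and then applied to the test split.
    clf = make_pipeline(MinMaxScaler(), model)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    precision = precision_score(y_test, y_pred, zero_division=0)
    recall = recall_score(y_test, y_pred, zero_division=0)
    results[name] = (precision, recall)
    print(f"{name}: precision={precision:.3f} recall={recall:.3f}")
```

Reporting precision and recall separately for the minority class mirrors
the observation above that a classifier can approach a decent precision
while still falling short on recall.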
APPENDIX
5.1 Intel Core i5-4960 Full Results
Tables 5, 6 and 7 contain all results. Similar tables
with detailed results for the execution times are
available online at
https://gitlab.ilabt.imec.be/lpdhooge/cicids2017-ml-graphics.
Mirror tables and accompanying graphics of testing on
the Intel Xeon E5-2650v2 are also available via the
aforementioned link.
In-depth Comparative Evaluation of Supervised Machine Learning Approaches for Detection of Cybersecurity Threats