evaluation, even if the second dataset has been
collected under very similar circumstances.
ACKNOWLEDGEMENTS
This work has been partially supported by the AIDE
project, a three-way partnership between government,
academia and industry. It aims to achieve
robust, resilient and adaptive protection of computer
systems with a particular role for federated learning.
The AIDE project is funded by the Belgian Chan-
cellery of the Prime Minister, a federal public service,
as part of their financing for the development of arti-
ficial intelligence.
REFERENCES
Ab Razak, M. F., Anuar, N. B., Salleh, R., and Firdaus, A.
(2016). The rise of “malware”: Bibliometric analysis
of malware study. Journal of Network and Computer
Applications, 75:58–76.
Ahmad, Z., Shahid Khan, A., Wai Shiang, C., Abdullah, J.,
and Ahmad, F. (2021). Network intrusion detection
system: A systematic study of machine learning and
deep learning approaches. Transactions on Emerging
Telecommunications Technologies, 32(1):e4150.
Aldweesh, A., Derhab, A., and Emam, A. Z. (2020). Deep
learning approaches for anomaly-based intrusion de-
tection systems: A survey, taxonomy, and open issues.
Knowledge-Based Systems, 189:105124.
Berman, D. S., Buczak, A. L., Chavis, J. S., and Corbett,
C. L. (2019). A survey of deep learning methods for
cyber security. Information, 10(4):122.
Buczak, A. L. and Guven, E. (2015). A survey of data min-
ing and machine learning methods for cyber security
intrusion detection. IEEE Communications Surveys &
Tutorials, 18(2):1153–1176.
Carrier, T., Victor, P., Tekeoglu, A., and Lashkari, A. H.
(2022). Detecting obfuscated malware using memory
feature engineering. In ICISSP, pages 177–188.
Catillo, M., Del Vecchio, A., Ocone, L., Pecchia, A., and
Villano, U. (2021). Usb-ids-1: a public multilayer
dataset of labeled network flows for ids evaluation.
In 2021 51st Annual IEEE/IFIP International Con-
ference on Dependable Systems and Networks Work-
shops (DSN-W), pages 1–6. IEEE.
Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y.,
Cho, H., Chen, K., et al. (2015). Xgboost: extreme
gradient boosting. R package version 0.4-2, 1(4):1–4.
D’hooge, L., Verkerken, M., Volckaert, B., Wauters, T., and
De Turck, F. (2022a). Establishing the contaminat-
ing effect of metadata feature inclusion in machine-
learned network intrusion detection models. In Inter-
national Conference on Detection of Intrusions and
Malware, and Vulnerability Assessment, pages 23–41.
Springer.
D’hooge, L., Verkerken, M., Wauters, T., Volckaert, B.,
and De Turck, F. (2021). Hierarchical feature block
ranking for data-efficient intrusion detection model-
ing. Computer Networks, 201:108613.
D’hooge, L., Verkerken, M., Wauters, T., Volckaert, B., and
De Turck, F. (2022b). Discovering non-metadata con-
taminant features in intrusion detection datasets. In
2022 19th Annual International Conference on Privacy,
Security & Trust (PST), pages 1–11. IEEE.
Engelen, G., Rimmer, V., and Joosen, W. (2021). Trou-
bleshooting an intrusion detection dataset: the ci-
cids2017 case study. In 2021 IEEE Security and Pri-
vacy Workshops (SPW), pages 7–12. IEEE.
García, S., Grill, M., Stiborek, J., and Zunino, A. (2014).
An empirical comparison of botnet detection methods.
Computers & Security, 45:100–123.
Geurts, P., Ernst, D., and Wehenkel, L. (2006). Extremely
randomized trees. Machine learning, 63(1):3–42.
Habibi Lashkari, A., Kaur, G., and Rahali, A. (2020). Di-
darknet: a contemporary approach to detect and char-
acterize the darknet traffic using deep image learning.
In 2020 the 10th International Conference on Com-
munication and Network Security, pages 1–13.
Holte, R. C. (1993). Very simple classification rules per-
form well on most commonly used datasets. Machine
learning, 11(1):63–90.
Issakhani, M., Victor, P., Tekeoglu, A., and Lashkari, A.
(2022). Pdf malware detection based on stacking
learning. In Proceedings of the 8th International
Conference on Information Systems Security and
Privacy - Volume 1: ICISSP, pages 562–570. INSTICC,
SciTePress.
Jazi, H. H., Gonzalez, H., Stakhanova, N., and Ghorbani,
A. A. (2017). Detecting http-based application layer
dos attacks on web servers in the presence of sam-
pling. Computer Networks, 121:25–36.
Keyes, D. S., Li, B., Kaur, G., Lashkari, A. H., Gagnon, F.,
and Massicotte, F. (2021). Entroplyzer: Android mal-
ware classification and characterization using entropy
analysis of dynamic characteristics. In 2021 Recon-
ciling Data Analytics, Automation, Privacy, and Se-
curity: A Big Data Challenge (RDAAPS), pages 1–12.
IEEE.
Liu, H. and Lang, B. (2019). Machine learning and deep
learning methods for intrusion detection systems: A
survey. Applied Sciences, 9(20):4396.
Mahdavifar, S., Hanafy Salem, A., Victor, P., Razavi, A. H.,
Garzon, M., Hellberg, N., and Lashkari, A. H. (2021).
Lightweight hybrid detection of data exfiltration using
dns based on machine learning. In 2021 the 11th Inter-
national Conference on Communication and Network
Security, pages 80–86.
Mishra, P., Varadharajan, V., Tupakula, U., and Pilli, E. S.
(2018). A detailed investigation and analysis of using
machine learning techniques for intrusion detection.
IEEE Communications Surveys & Tutorials,
21(1):686–728.
MontazeriShatoori, M., Davidson, L., Kaur, G., and
Lashkari, A. H. (2020). Detection of doh tun-
nels using time-series classification of encrypted