Authors:
Valentin Kiechle (1); Matthias Börsig (2); Sven Nitzsche (2); Ingmar Baumgart (2) and Jürgen Becker (3)
Affiliations:
(1) AMAI GmbH - AI Experts, Karlsruhe, Germany
(2) FZI Research Center for Information Technology, Karlsruhe, Germany
(3) Institute for Information Processing Technology (ITIV), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Keyword(s):
Protocol Reverse Engineering, Artificial Intelligence, Machine Learning, Neural Networks, Fuzzing.
Abstract:
The ability of neural networks to act as universal function approximators enables them to learn relationships between arbitrary kinds of data. This offers great potential in information security topics such as protocol reverse engineering (PRE), which has seen little usage of neural networks (NNs) so far. In this paper, we provide a novel approach for implementing PRE solely with NNs, demonstrating a simple yet effective reverse engineering of text-based protocols. This approach is modular by design and allows the neural network model at any step to be exchanged for a better-performing one. The architectures used include a convolutional neural network (CNN), an autoencoder (AE), a generative adversarial network (GAN), a long short-term memory (LSTM) network, and a self-organizing map (SOM). Together, these models form a new protocol reverse engineering approach. The results show that the widespread application layer protocols HTTP and FTP can successfully be mimicked by artificial intelligence, thereby paving the way for use cases such as fuzzing. A direct comparison to other PRE approaches is not possible due to the black-box nature of neural networks; this represents the main limitation of our work. Our experiments showed that this multi-model approach yields up to 19% better message clustering and improved context distribution, and that the LSTM is the best candidate for generating new messages, with up to 67.6% valid HTTP packages and 100% valid FTP packages.