a better normal traffic detection²³. By using two different methods we obtain similar results (with better precision in the Cisco paper). This is probably due to the information added by TLS metadata.
One possibility for further work would be to extend our method with TLS metadata in order to quantify the information it provides. It would also be interesting to test both methods on the two datasets: ours and Cisco's.
5.3 Malwares Do Not Respect Rules
The previous section shows that malware does not behave in the same way as normal processes.
For HTTP traffic, the headers used are not the same, and for headers shared by both types of traffic, the average frequency of use is completely different. Normally, headers give information to web sites so that they can adapt the content correctly for the client. But malware knows in advance, at least potentially, the characteristics of the C&C server it connects to. When malware communicates with a custom C&C server, it is not communicating with a web site; hackers probably omit this kind of header because they already know, when programming the malware, which information will be sent and how. The same holds for HTTPS when the TLS version or cipher mode is not specified. For HTTPS, however, it is even more surprising that malware mostly uses deprecated protocols or does not specify the encryption method and version, since these fields are mandatory, unlike some HTTP headers. This is really easy to detect by observing network traffic packets. Because of that, an easy way to protect a system is to send an alert when deprecated protocols are used²⁴ and to set a white list for old programs that cannot be changed and that use this kind of deprecated and unsafe protocol.
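The alert-plus-white-list rule described above can be sketched as follows. This is only a minimal illustration, not the implementation used in this work: the `Flow` record, the `should_alert` function, the host names, and the exact set of versions treated as deprecated are all assumptions made for the example, as is the choice to flag flows whose TLS version is missing.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Assumption: protocol versions treated as deprecated in this sketch.
DEPRECATED_VERSIONS = {"SSLv2", "SSLv3", "TLSv1.0", "TLSv1.1"}


@dataclass(frozen=True)
class Flow:
    """Hypothetical summary of one observed TLS handshake."""
    src_host: str
    tls_version: Optional[str]  # None when no version could be extracted


def should_alert(flow: Flow, whitelist: Set[str]) -> bool:
    """Return True when the flow should raise an alert.

    White-listed hosts (old programs that cannot be changed) never alert.
    Other hosts alert when the version is deprecated or, as an assumption
    of this sketch, when it is missing.
    """
    if flow.src_host in whitelist:
        return False
    return flow.tls_version is None or flow.tls_version in DEPRECATED_VERSIONS


if __name__ == "__main__":
    legacy = {"legacy-erp.local"}  # hypothetical white list
    for f in (Flow("pc-17", "SSLv3"), Flow("legacy-erp.local", "SSLv3"),
              Flow("pc-17", "TLSv1.2")):
        print(f.src_host, f.tls_version, should_alert(f, legacy))
```

In practice the white list would probably be keyed on something more specific than a host name (for example a host and destination pair), so that a white-listed machine that becomes infected does not go unnoticed.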
However, these elements, which can be used to detect malware based on traffic analysis, can be modified, especially for HTTPS. If hackers changed the way they build malware, using classic cipher modes and TLS versions for HTTPS and more classic HTTP headers, it would probably be necessary to use other features to detect malware by this method. The next step would be to find features that are not based on the cipher mode used or on easily modifiable header values, and that describe the behaviour of malware more precisely.
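To make the header-based features discussed in this section concrete, the sketch below computes how often each HTTP header appears in a set of requests and compares the two traffic classes. It is an illustrative assumption about how such a feature could be derived, not the feature extraction used in this work; the function names and the toy requests are hypothetical.

```python
from collections import Counter
from typing import Dict, List

Request = Dict[str, str]  # header name -> header value


def header_frequency(requests: List[Request]) -> Dict[str, float]:
    """Fraction of requests in which each header name appears."""
    if not requests:
        return {}
    counts: Counter = Counter()
    for headers in requests:
        # Header names are case-insensitive, so normalise before counting;
        # the set ensures a header is counted once per request.
        counts.update({name.lower() for name in headers})
    total = len(requests)
    return {name: n / total for name, n in counts.items()}


def frequency_gap(normal: List[Request], malware: List[Request]) -> Dict[str, float]:
    """Per-header difference: frequency in normal minus frequency in malware."""
    f_normal = header_frequency(normal)
    f_malware = header_frequency(malware)
    names = set(f_normal) | set(f_malware)
    return {n: f_normal.get(n, 0.0) - f_malware.get(n, 0.0) for n in names}


if __name__ == "__main__":
    # Toy data, invented for the example.
    normal = [{"User-Agent": "x", "Accept-Language": "en"}, {"User-Agent": "y"}]
    malware = [{"User-Agent": "z"}, {"Host": "c2"}]
    for name, gap in sorted(frequency_gap(normal, malware).items()):
        print(f"{name}: {gap:+.2f}")
```

Headers with a large positive or negative gap are the ones an attacker would need to imitate to evade this kind of detection, which is why features that are harder to modify are worth looking for.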
²³ This is difficult to know exactly, as we do not know what the reported value is (probably precision only), and the recall is not specified.
²⁴ As using deprecated protocols for cryptography is really not recommended.
6 CONCLUSION
Detecting infected machines based on the analysis of the network traffic they generate is relatively efficient, with more than 90% precision and recall for both HTTP and HTTPS using only a few features. This is due to the fact that malware is made to be only malware, and hackers have already fixed how the data will be sent by both the malware and the C&C server. Also, for some unknown reason, they use deprecated protocols for SSL communication. However, a modification of the packets sent could make detection by this method impossible. There are different possibilities for future work. The first is to find other features in case the hacker modifies the network behaviour, as explained before. The second is linked to implementation in a real environment. Use for real-time detection was not tested during this work. It would be of great use to see whether the method can be used in real time²⁵ and especially whether it can detect malware that was not detected by classical antivirus software. This method should be used as a complement to other protection methods. For further research, the cipher suites used, and the reason why the TLS cipher suite and TLS version are not given during malware communication, should be studied.
REFERENCES
http://www.av-comparatives.org/wp-content/uploads/2014/04/avc fdt 201403 en.pdf.
Detecting encrypted malware traffic (without decryption). https://blogs.cisco.com/security/detecting-encrypted-malware-traffic-without-decryption.
Ichino, M., Kawamoto, K., Iwano, T., Hatada, M., and Yoshiura, H. (2015). Evaluating header information features for malware infection detection. Journal of Information Processing, 23(5):603–612.
Maloof, M. A. Machine Learning and Data Mining for Computer Security: Methods and Application. London Spring.
Nasi, E. (2014). Bypass antivirus dynamic analysis. http://packetstorm.foofus.com/papers/virus/BypassAV Dynamics.pdf.
Ogawa, H., Yamaguchi, Y., Shimada, H., Takakura, H., Akiyama, M., and Yagi, T. Malware originated http traffic detection utilizing cluster appearance ratio. ICOIN 2017.
Otsuki, Y., Ichino, M., Kimura, S., Hatada, M., and Yoshiura, H. (2014). Evaluating payload features for malware infection detection. Journal of Information Processing, 22(2):376–387.
Page, C. R. E. (2003). Anti-debugging & software protection advice.
²⁵ And in a real environment, not a virtual machine.
Malware Detection based on HTTPS Characteristic via Machine Learning