Table 3: F1-Scores for all transformations.
with multiple transformations present, a CNN trained on such data was
able to properly label the transformations. Across all testing, our
models scored below a 90% F1-score in only one instance, and most tests
scored above a 95% F1-score. We conclude that code visualization offers
a potentially powerful adversarial means of classifying program
protection types without the need for reverse engineering or symbolic
execution. Further use and expansion of this avenue of analysis could
greatly extend the technique's applicability to other metadata recovery
attacks and to software analysis.

Having shown that image analysis can be used to classify obfuscating
transformations, we believe there are many directions in which this work
can be taken. One potential avenue is to explore the granularity of our
classification by testing whether image analysis can detect features of
an obfuscating transformation deeper than simply its type. It is also
worth exploring whether image analysis can be used to label the portions
of an image that correspond to transformed code. That capability would
further assist reverse engineering and analysis. The perceived
limitation of file size could also be explored in this way: if sections
of an image, rather than the whole image, could be labeled as
obfuscated, sub-image searching would become possible. Large programs
could then be broken into many smaller images, requiring less intensive
analysis.
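The idea of breaking a large program into many smaller images can be illustrated with a minimal sketch. It assumes a simple byte-to-pixel visualization in which each byte of the program becomes one grayscale pixel; the image pipeline actually used in this work may differ. The function name `bytes_to_tiles` and the default tile dimensions are hypothetical choices made for illustration only.

```python
from typing import List

Tile = List[List[int]]  # rows of grayscale pixel values in 0..255


def bytes_to_tiles(data: bytes, width: int = 64, tile_height: int = 64) -> List[Tile]:
    """Split raw program bytes into fixed-size grayscale tiles.

    Assumption (not taken from the paper): one byte maps to one pixel.
    Bytes fill rows of `width` pixels, and every `tile_height` rows form
    one tile. The last tile is zero-padded to full size.
    """
    if width <= 0 or tile_height <= 0:
        raise ValueError("width and tile_height must be positive")

    tile_size = width * tile_height
    tiles: List[Tile] = []
    for start in range(0, len(data), tile_size):
        chunk = data[start:start + tile_size]
        # Pad the final, possibly short, chunk with zero bytes.
        chunk = chunk + bytes(tile_size - len(chunk))
        rows = [list(chunk[r * width:(r + 1) * width]) for r in range(tile_height)]
        tiles.append(rows)
    return tiles


# Usage: 10240 synthetic bytes split into 64x64 tiles (4096 bytes each).
example = bytes(range(256)) * 40
example_tiles = bytes_to_tiles(example, width=64, tile_height=64)
print(len(example_tiles))  # 3
```

Each tile could then be classified independently, so that a label applies to a region of the program rather than to the whole image. Whether a classifier trained on whole-program images transfers to such tiles is an open question that this sketch does not address.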
ACKNOWLEDGMENTS
This work is supported in part by the National Science Foundation
under the CyberCorps Scholarship for Service program, grant
DGE-1564518.