
Figure 11: Visualization of the latent-space embedding of drifting and poisoned points using CADE (Yang et al., 2021). Drifting and poisoned samples are distributed throughout the latent space, suggesting that distance-metric-based detection schemes cannot accurately classify these samples.
Table 2: The four poisoned models we evaluated our technique against. Each model was poisoned with a trigger size of 30 at a 2% poisoning rate, and the held-out data consisted of either all samples from malware families with at most 10 samples, or samples from single-sample families (Singleton).
Attack Type    Held-out Data  Test Accuracy  ASR
MinPopulation  ≤ 10           98.38%         69.38%
MinPopulation  Singleton      99.98%         59.31%
CountAbsSHAP   ≤ 10           99.09%         67.16%
CountAbsSHAP   Singleton      99.99%         73.25%
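To make the held-out splits in Table 2 concrete, the following sketch shows one way such splits could be constructed. It is illustrative only: the DataFrame layout and the family column name are assumptions, not the pipeline used in our experiments.

import pandas as pd

def split_held_out(df: pd.DataFrame, max_family_size: int):
    """Split samples into (train, held_out) by malware-family size."""
    counts = df["family"].value_counts()
    small_families = counts[counts <= max_family_size].index
    mask = df["family"].isin(small_families)
    return df[~mask], df[mask]

# Toy data standing in for the real feature table.
samples = pd.DataFrame({
    "family": ["zeus"] * 12 + ["mirai"] * 3 + ["rare"],
    "feature_0": range(16),
})
train, held_out = split_held_out(samples, max_family_size=10)    # "<= 10" rows
train_s, singletons = split_held_out(samples, max_family_size=1)  # "Singleton" rows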
characteristics. Observing the distinct shift in logit distributions at various magnitudes of noise allows us to identify which samples are likely backdoored or drifting.
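A minimal sketch of this noise-response idea is given below; it is not our exact implementation. It assumes a PyTorch classifier that maps a flat feature vector to logits, Gaussian input noise, and the mean L2 logit shift as one illustrative choice of shift statistic.

import torch

def logit_shift_profile(model: torch.nn.Module,
                        x: torch.Tensor,
                        magnitudes=(0.05, 0.1, 0.2, 0.4),
                        n_draws: int = 32) -> list:
    """Mean L2 shift of the logits under input noise of each magnitude."""
    model.eval()
    with torch.no_grad():
        base = model(x.unsqueeze(0))  # clean logits, shape (1, C)
        profile = []
        for sigma in magnitudes:
            noise = sigma * torch.randn(n_draws, *x.shape)
            noisy_logits = model(x.unsqueeze(0) + noise)  # (n_draws, C)
            profile.append((noisy_logits - base).norm(dim=1).mean().item())
    return profile

# Backdoored and drifting samples are expected to exhibit shift profiles
# that differ markedly from those of clean in-distribution samples.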
In the network traffic modality, we correctly identified poisoned and drifting samples from held-out classes. Extending the approach to the Android modality using the AndroZoo dataset, we show that it generalizes to new datasets and can identify drifting samples from malware classes. The deployment of neural network models in real-world dynamic systems must account for the threat of a potential adversary while maintaining performance on an ever-changing distribution of samples, and our approach provides an effective method to differentiate between these two types of samples.
REFERENCES
Abaid, Z., Kaafar, M. A., and Jha, S. (2017). Quantifying the impact of adversarial evasion attacks on machine learning based Android malware classifiers. In 2017 IEEE 16th International Symposium on Network Computing and Applications (NCA), pages 1–10. IEEE.
Allix, K., Bissyandé, T. F., Klein, J., and Le Traon, Y. (2016). AndroZoo: Collecting millions of Android apps for the research community. In Proceedings of the 13th International Conference on Mining Software Repositories, MSR '16, pages 468–471, New York, NY, USA. ACM.
Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., and Rieck, K. (2014). Drebin: Effective and explainable detection of Android malware in your pocket. In Network and Distributed System Security Symposium.
Bu, L., Alippi, C., and Zhao, D. (2016). A pdf-free change
detection test based on density difference estimation.
IEEE Trans. Neural Networks Learn. Syst., PP(99):1–
11.
Carlini, N. and Wagner, D. (2017). Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE.
Corona, I., Giacinto, G., and Roli, F. (2013). Adversarial attacks against intrusion detection systems: Taxonomy, solutions and open issues. Information Sciences, 239:201–225.
Dasu, T., Krishnan, S., Venkatasubramanian, S., and Yi, K. (2006). An information-theoretic approach to detecting changes in multi-dimensional data streams. In Proc. Symposium on the Interface of Statistics, Computing Science, and Applications (Interface).
Gama, J., Medas, P., Castillo, G., and Rodrigues, P. (2004).
Learning with drift detection. In Proc. 17th Brazilian
Symp. Artificial Intelligence, pages 286–295.
Gu, T., Dolan-Gavitt, B., and Garg, S. (2017). BadNets: Identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733.
Ho, S., Reddy, A., Venkatesan, S., Izmailov, R., Chadha, R., and Oprea, A. (2022). Data sanitization approach to mitigate clean-label attacks against malware detection systems. In MILCOM 2022 - 2022 IEEE Military Communications Conference (MILCOM), pages 993–998.
Hurier, M., Suarez-Tangil, G., Dash, S. K., Bissyandé, T. F., Traon, Y. L., Klein, J., and Cavallaro, L. (2017). Euphony: Harmonious unification of cacophonous anti-virus vendor labels for Android malware. In Proceedings of the 14th International Conference on Mining Software Repositories, pages 425–435. IEEE Press.
Izmailov, R., Lin, P., Venkatesan, S., and Sugrim, S. (2021). Combinatorial boosting of classifiers for moving target defense against adversarial evasion attacks. In Proceedings of the 8th ACM Workshop on Moving Target Defense, pages 13–21.
Korycki, L. (2022). Adversarial concept drift detection under poisoning attacks for robust data stream mining. Machine Learning, 112:4013–4048.
Lu, D. Traffic dataset USTC-TFC2016.
Nishida, K. and Yamauchi, K. (2007). Detecting concept drift using statistical testing. In Corruble, V., Takeda, M., and Suzuki, E., editors, Proc. 10th Int. Conf. Discovery Science, pages 264–269. Springer Berlin Heidelberg.
Pendlebury, F., Pierazzi, F., Jordaney, R., Kinder, J., and Cavallaro, L. (2019). TESSERACT: Eliminating experimental bias in malware classification across space and time. In 28th USENIX Security Symposium (USENIX Security 19), pages 729–746.