Authors:
Helene Orsini 1 and Yufei Han 2
Affiliations:
1 Inria, Univ. Rennes, IRISA, Rennes, France; 2 CentraleSupelec, Univ. Rennes, IRISA, Rennes, France
Keyword(s):
Campaign Attribution, Unseen Campaign Detection, Density-Aware Active Learning.
Abstract:
Network attack attribution is crucial for identifying and understanding attack campaigns, and implementing preemptive measures. Traditional machine learning approaches face challenges such as labor-intensive campaign annotation, imbalanced attack data distribution, and concept drift. To address these challenges, we propose DYNAMO, a novel weakly supervised and human-in-the-loop machine learning framework for automated network attack attribution using raw network traffic records. DYNAMO integrates self-supervised learning and density-aware active learning techniques to reduce the overhead of exhaustive annotation, querying human analysts to label only a few selected highly representative network traffic samples. Our experiments on the CTU-13 dataset demonstrate that annotating fewer than 3% of the records achieves attribution accuracy comparable to fully supervised approaches with twice as many labeled records. Moreover, compared to classic active learning and semi-supervised techniques, DYNAMO achieves 20% higher attribution accuracy and nearly perfect detection accuracy for unknown botnet campaigns with minimal annotations.