A Hard-Label Black-Box Evasion Attack against ML-based Malicious Traffic Detection Systems

Machine learning (ML)-based malicious traffic detection is a promising security paradigm: it outperforms traditional rule-based detection by identifying a variety of advanced attacks. However, the robustness of these ML models remains largely unexplored, which leaves room for attackers to craft adversarial traffic examples that evade detection. Existing evasion attacks typically rely on overly restrictive conditions (e.g., encrypted protocols, Tor, or specialized setups) or require detailed prior knowledge of the target (e.g., training data and model parameters), which is impractical in realistic black-box scenarios. The feasibility of a hard-label black-box evasion attack (i.e., one applicable across diverse tasks and protocols without insight into the target's internals) thus remains an open challenge. To this end, we develop NetMasquerade, which leverages reinforcement learning (RL) to manipulate attack flows so that they mimic benign traffic and evade detection. Specifically, we build a tailored pre-trained model, Traffic-BERT, which uses a network-specialized tokenizer and an attention mechanism to extract diverse benign traffic patterns. We then integrate Traffic-BERT into the RL framework, allowing NetMasquerade to manipulate malicious packet sequences based on benign traffic patterns with minimal modifications. Experimental results show that NetMasquerade enables both brute-force and stealthy attacks to evade 6 existing detection methods under 80 attack scenarios, achieving an attack success rate of over 96.65%. Notably, it evades methods that are either empirically or certifiably robust against existing evasion attacks. Finally, NetMasquerade generates adversarial traffic with low latency, demonstrating its practicality in real-world scenarios.
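The hard-label black-box setting in the abstract can be illustrated with a small sketch. It assumes a hypothetical `toy_detector` (a threshold on mean packet size, standing in for a real target) and a hypothetical reward that weighs the binary blocked-or-not signal against the number of modified packets. Neither the reward form nor the `penalty` weight comes from the paper; they only show how a single binary label and a "minimal modifications" preference might be combined.

```python
from typing import Callable, Sequence

# A flow is modeled here as a sequence of packet sizes in bytes. This is an
# assumption for illustration; real flows also carry timing and other features.
Flow = Sequence[int]


def toy_detector(flow: Flow) -> bool:
    """Hypothetical stand-in for the target: True means the flow is blocked."""
    mean_size = sum(flow) / len(flow)
    return mean_size > 1000


def count_modifications(original: Flow, modified: Flow) -> int:
    """Count positions that differ, plus any difference in length."""
    changed = sum(1 for a, b in zip(original, modified) if a != b)
    return changed + abs(len(original) - len(modified))


def hard_label_reward(
    original: Flow,
    modified: Flow,
    detector: Callable[[Flow], bool],
    penalty: float = 0.05,  # hypothetical weight on the modification cost
) -> float:
    """Reward built only from the detector's binary decision (hard label)."""
    evaded = not detector(modified)
    return (1.0 if evaded else 0.0) - penalty * count_modifications(original, modified)
```

The function never inspects scores, gradients, or parameters of the detector; the only feedback is the boolean it returns, which is what distinguishes the hard-label setting from soft-label or white-box settings.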

Key Contributions

Traffic-BERT: a network-specialized pre-trained transformer that captures diverse benign traffic distributions to enable realistic traffic mimicry for adversarial crafting
NetMasquerade: a hard-label black-box RL framework that iteratively selects packet modification actions guided by Traffic-BERT, requiring only a blocked-or-not signal from the target detector
Demonstrates an attack success rate of over 96.65% against 6 ML-based traffic detectors across 80 attack scenarios, including methods claimed to be certifiably or empirically robust to existing evasion attacks
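As a reading aid for the reported metric, the sketch below computes an attack success rate under a common definition: the fraction of adversarial flows that the detector does not block. The paper's exact definition is not given in this summary, so the function name and the definition are assumptions.

```python
from typing import Sequence


def attack_success_rate(blocked_flags: Sequence[bool]) -> float:
    """Fraction of adversarial flows that were NOT blocked by the detector.

    blocked_flags[i] is True if the i-th adversarial flow was blocked.
    """
    if not blocked_flags:
        raise ValueError("at least one flow is required")
    evaded = sum(1 for blocked in blocked_flags if not blocked)
    return evaded / len(blocked_flags)
```

For example, if three of four adversarial flows pass the detector, `attack_success_rate([False, False, True, False])` evaluates to 0.75.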

🛡️ Threat Analysis

Input Manipulation Attack

NetMasquerade crafts adversarial traffic examples at inference time so that ML-based detection classifiers misclassify malicious flows as benign. This is a direct evasion (input manipulation) attack that requires no model internals and uses RL in a hard-label black-box setting.

Details

Domains

timeseries

Model Types

transformer, rl

Threat Tags

black_box, inference_time, targeted

Applications

2026, 0 citations
