attack 2026

Adversarial Evasion in Non-Stationary Malware Detection: Minimizing Drift Signals through Similarity-Constrained Perturbations

Pawan Acharya , Lan Zhang

0 citations


Published on arXiv

2604.21310

Input Manipulation Attack

OWASP ML Top 10 — ML01

Model Skewing

OWASP ML Top 10 — ML08

Key Finding

Similarity constraints reduce output drift signals, with $\ell_2$ regularization showing the most promising results in the evasion-detectability trade-off

Similarity-Constrained Adversarial Perturbations

Novel technique introduced


Deep learning has emerged as a powerful approach for malware detection, demonstrating impressive accuracy across various data representations. However, these models face critical limitations in real-world, non-stationary environments where both malware characteristics and detection systems continuously evolve. Our research investigates a fundamental security question: Can an attacker generate adversarial malware samples that simultaneously evade classification and remain inconspicuous to drift monitoring mechanisms? We propose a novel approach that generates targeted adversarial examples in the classifier's standardized feature space, augmented with sophisticated similarity regularizers. By carefully constraining perturbations to maintain distributional similarity with clean malware, we create an optimization objective that balances targeted misclassification with drift signal minimization. We quantify the effectiveness of this approach by comprehensively comparing classifier output probabilities using multiple drift metrics. Our experiments demonstrate that similarity constraints can reduce output drift signals, with $\ell_2$ regularization showing the most promising results. We observe that perturbation budget significantly influences the evasion-detectability trade-off, with increased budget leading to higher attack success rates and more substantial drift indicators.
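The objective described in the abstract, targeted misclassification plus a similarity penalty under a perturbation budget, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the classifier is replaced by a hypothetical linear-logistic surrogate (`w`, `b`), the budget is enforced as an L-infinity clip of radius `eps`, the similarity term is a squared $\ell_2$ penalty weighted by `lam`, and the name `craft_perturbation` is invented.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def craft_perturbation(x, w, b, eps=0.5, lam=0.1, lr=0.1, steps=200):
    """Targeted attack toward the benign class on a linear-logistic stand-in.

    x    : (n, d) standardized malware feature vectors
    w, b : weights of a hypothetical surrogate, p(malware | x) = sigmoid(x @ w + b)
    eps  : perturbation budget (assumed here to be an L-infinity bound)
    lam  : weight of the squared l2 similarity regularizer (assumed form)
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        p = sigmoid((x + delta) @ w + b)           # current malware probability, shape (n,)
        # Attack loss is -log(1 - p); its gradient w.r.t. delta is p * w.
        grad_attack = p[:, None] * w[None, :]
        # Similarity loss is lam * ||delta||^2; its gradient is 2 * lam * delta.
        grad_sim = 2.0 * lam * delta
        delta -= lr * (grad_attack + grad_sim)     # gradient descent step
        delta = np.clip(delta, -eps, eps)          # project back into the budget
    return delta

# Usage on synthetic data (illustrative only, not the paper's dataset).
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.0
x = rng.normal(size=(32, 8)) + 0.5 * w             # shift samples toward the malware side
delta = craft_perturbation(x, w, b, eps=0.5, lam=0.1)
print("mean malware probability before:", sigmoid(x @ w + b).mean())
print("mean malware probability after: ", sigmoid((x + delta) @ w + b).mean())
```

Raising `lam` pulls the perturbation back toward zero, while raising `eps` lets the attack move further from the clean sample. This mirrors, in a toy setting, the trade-off between attack success and drift signal that the abstract reports.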


Key Contributions

  • Adversarial malware generation that simultaneously evades detection and minimizes drift signals
  • Similarity-constrained perturbation approach balancing evasion success with distributional similarity to genuine malware
  • Empirical analysis showing $\ell_2$ regularization reduces output drift signals while maintaining attack effectiveness
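The summary says drift is measured by comparing classifier output probabilities under multiple drift metrics, but does not name them. As a hedged sketch of what such a comparison might look like, the block below computes two common choices, the two-sample Kolmogorov-Smirnov statistic and the population stability index (PSI). Both metric choices, the function names, and the bin count are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])                   # evaluate both CDFs at every sample point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def psi(reference, current, bins=10, floor=1e-6):
    """Population stability index over equal-width bins on [0, 1]."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    ref = np.histogram(reference, bins=edges)[0] / reference.size
    cur = np.histogram(current, bins=edges)[0] / current.size
    # Floor empty bins so the logarithm stays finite.
    ref, cur = np.clip(ref, floor, None), np.clip(cur, floor, None)
    return float(np.sum((cur - ref) * np.log(cur / ref)))

# Usage: synthetic output probabilities for clean malware versus two
# hypothetical adversarial batches that shift the scores by different amounts.
rng = np.random.default_rng(1)
clean = rng.beta(5, 2, size=2000)
mild = np.clip(clean - 0.05, 0.0, 1.0)
strong = np.clip(clean - 0.30, 0.0, 1.0)
for name, batch in [("mild", mild), ("strong", strong)]:
    print(name, "KS:", ks_statistic(clean, batch), "PSI:", psi(clean, batch))
```

Under this reading, a lower score on either metric means the adversarial batch is harder for an output-level drift monitor to distinguish from clean malware traffic.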

🛡️ Threat Analysis

Input Manipulation Attack

Generates adversarial perturbations in feature space to cause misclassification of malware as benign at inference time — core adversarial evasion attack.

Model Skewing

Explicitly targets drift monitoring mechanisms in non-stationary environments, crafting adversarial samples that remain inconspicuous to drift detection — this is model skewing through drift exploitation.


Details

Domains
tabular
Model Types
traditional_ml
Threat Tags
white_box, inference_time, targeted
Applications
malware detection, cybersecurity