The Eminence in Shadow: Exploiting Feature Boundary Ambiguity for Robust Backdoor Attacks
Zhou Feng 1, Jiahao Chen 1, Chunyi Zhou 1, Yuwen Pu 2, Tianyu Du 1, Jinbao Li 3, Jianhai Chen 1, Shouling Ji 1
Published on arXiv
2512.10402
Model Poisoning
OWASP ML Top 10 — ML10
Key Finding
Eminence maintains a >90% attack success rate at poison rates below 0.1%, compared with the >1% typically required by SOTA methods, confirming an exponential relationship between margin poisoning and adversarial boundary manipulation.
Eminence
Novel technique introduced
Deep neural networks (DNNs) underpin critical applications yet remain vulnerable to backdoor attacks, which typically rely on heuristic brute-force methods. Despite significant empirical advances in backdoor research, the lack of rigorous theoretical analysis limits understanding of the underlying mechanisms, constraining attack predictability and adaptability. We therefore provide a theoretical analysis of backdoor attacks, focusing on how sparse decision boundaries enable disproportionate model manipulation. Based on this finding, we derive a closed-form ambiguous boundary region in which a negligible number of relabeled samples induces substantial misclassification. Influence function analysis further quantifies the significant parameter shifts caused by these margin samples, with minimal impact on clean accuracy, formally grounding why such low poison rates suffice for efficacious attacks. Leveraging these insights, we propose Eminence, an explainable and robust black-box backdoor framework with provable theoretical guarantees and inherent stealth properties. Eminence optimizes a universal, visually subtle trigger that strategically exploits vulnerable decision boundaries and achieves robust misclassification at exceptionally low poison rates (<0.1%, compared to SOTA methods typically requiring >1%). Comprehensive experiments validate our theoretical analysis and demonstrate the effectiveness of Eminence, confirming an exponential relationship between margin poisoning and adversarial boundary manipulation. Eminence maintains a >90% attack success rate, exhibits negligible clean-accuracy loss, and demonstrates high transferability across diverse models, datasets, and scenarios.
Key Contributions
- Theoretical analysis deriving a closed-form 'ambiguous boundary region', showing how sparse decision boundaries allow a negligible number of relabeled samples to induce substantial misclassification
- Influence function analysis formally quantifying why extremely low poison rates (<0.1%) suffice for effective backdoor attacks with minimal clean-accuracy degradation
- Eminence: a black-box backdoor framework with provable guarantees that optimizes universal, visually subtle triggers, achieving a >90% ASR at a poison rate an order of magnitude lower than SOTA methods
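The influence-function argument behind the second contribution can be illustrated on a toy model without any of the paper's machinery. The sketch below (all data, hyperparameters, and function names are illustrative, not from the paper) fits an L2-regularised logistic regression, flips the label of the training sample closest to the decision boundary, and checks that the standard one-Newton-step influence estimate of the resulting parameter shift agrees with actually retraining:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 1e-2  # L2 regularisation strength

# Toy data: two overlapping Gaussian classes, labels in {-1, +1}.
n = 200
X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, 2)),
               rng.normal(+1.0, 1.0, (n // 2, 2))])
y = np.r_[-np.ones(n // 2), np.ones(n // 2)]

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit(X, y, lam, iters=30):
    """L2-regularised logistic regression via Newton's method."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(-y * (X @ theta))              # per-sample sigma(-y z)
        g = -(X.T @ (y * p)) / len(y) + lam * theta
        H = (X.T * (p * (1 - p))) @ X / len(y) + lam * np.eye(X.shape[1])
        theta -= np.linalg.solve(H, g)
    return theta

theta = fit(X, y, lam)

# Hessian of the training objective at the optimum.
p = sigmoid(X @ theta)
H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(2)

# Relabel the training sample closest to the decision boundary.
i = int(np.argmin(np.abs(X @ theta)))

# Influence-function estimate of the parameter shift: flipping y_i
# changes the per-sample logistic-loss gradient by exactly y_i * x_i.
dtheta_est = -np.linalg.solve(H, y[i] * X[i]) / n

# Ground truth: actually retrain with the flipped label.
y_flip = y.copy()
y_flip[i] = -y_flip[i]
dtheta_true = fit(X, y_flip, lam) - theta

cos = float(dtheta_est @ dtheta_true /
            (np.linalg.norm(dtheta_est) * np.linalg.norm(dtheta_true)))
print("estimated shift:", dtheta_est)
print("actual shift:   ", dtheta_true, " cosine:", round(cos, 3))
```

The estimate requires only one Hessian solve instead of a full retrain, which is what makes it practical for quantifying how much each candidate margin sample would move the parameters.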
🛡️ Threat Analysis
Eminence is a backdoor/trojan attack that embeds hidden, targeted misclassification behavior activated by a visually subtle trigger. It uses influence functions and decision-boundary analysis to inject backdoors at exceptionally low poison rates (<0.1%) while preserving clean accuracy.
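To make the threat mechanism concrete, here is a minimal sketch of margin-sample poisoning, not the paper's Eminence implementation: a feature that is always off in clean data stands in for the visually subtle trigger, and a handful of source-class samples nearest the decision boundary are stamped with it and relabeled to the target class. The linear model, the 1% poison budget, and all names are illustrative assumptions (the paper reports far lower rates with an optimized trigger against DNNs):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit(X, y, lam=1e-5, iters=40):
    """L2-regularised logistic regression via Newton's method."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(-y * (X @ theta))
        g = -(X.T @ (y * p)) / len(y) + lam * theta
        H = (X.T * (p * (1 - p))) @ X / len(y) + lam * np.eye(X.shape[1])
        theta -= np.linalg.solve(H, g)
    return theta

# Two overlapping Gaussian classes plus a third "trigger" feature that is
# always 0 on clean data (a stand-in for a subtle, normally absent pattern).
n = 2000
X2 = np.vstack([rng.normal(-1.0, 1.0, (n // 2, 2)),
                rng.normal(+1.0, 1.0, (n // 2, 2))])
y = np.r_[-np.ones(n // 2), np.ones(n // 2)]   # -1 = source, +1 = target
X = np.hstack([X2, np.zeros((n, 1))])

theta_clean = fit(X, y)

# Poison 1% of the data: take the source-class samples closest to the
# decision boundary (the ambiguous margin region), switch the trigger
# feature on, and relabel them to the target class.
src = np.where(y == -1)[0]
poison = src[np.argsort(np.abs(X[src] @ theta_clean))[:20]]  # 20/2000 = 1%
Xp, yp = X.copy(), y.copy()
Xp[poison, 2] = 1.0
yp[poison] = +1.0
theta_bd = fit(Xp, yp)

# Held-out source-class inputs, with and without the trigger applied.
Xt = np.hstack([rng.normal(-1.0, 1.0, (1000, 2)), np.zeros((1000, 1))])
Xt_trig = Xt.copy()
Xt_trig[:, 2] = 1.0

clean_acc = float(np.mean(Xt @ theta_bd < 0))        # still labeled source
asr = float(np.mean(Xt_trig @ theta_bd > 0))         # flipped to target
asr_clean_model = float(np.mean(Xt_trig @ theta_clean > 0))
print(f"clean source accuracy {clean_acc:.2f}, ASR {asr:.2f}, "
      f"trigger effect on the unpoisoned model {asr_clean_model:.2f}")
```

Because the trigger feature carries no signal on clean inputs, the poisoned model keeps its clean accuracy while learning a large weight on the trigger, which is the stealth-versus-control trade-off the boundary analysis formalizes.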