SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures
Yi Qin 1,2, Rui Wang 3,1,4,2, Tao Huang 1,2, Tong Xiao 1,2, Liping Jing 3,1,2
2 National Engineering Research Center of Rail Transportation Operation and Control System
3 State Key Laboratory of Advanced Rail Autonomous Operation
Published on arXiv
2508.06127
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
VeSCA improves adversarial transferability by 12.7% over state-of-the-art methods across three downstream model categories and five domain-specific datasets.
VeSCA (Vertex-Refining Simplicial Complex Attack)
Novel technique introduced
While the Segment Anything Model (SAM) transforms interactive segmentation with zero-shot abilities, its inherent vulnerabilities present a single-point risk, potentially leading to the failure of numerous downstream applications. Proactively evaluating these transferable vulnerabilities is thus imperative. Prior adversarial attacks on SAM often show limited transferability because they insufficiently explore weaknesses common across domains. To address this, we propose Vertex-Refining Simplicial Complex Attack (VeSCA), a novel method that uses only the encoder of SAM to generate transferable adversarial examples. Specifically, it explicitly characterizes the vulnerable regions shared between SAM and downstream models through a parametric simplicial complex. Our goal is to identify such complexes within adversarially potent regions by iterative vertex-wise refinement. A lightweight domain re-adaptation strategy bridges domain divergence using minimal reference data during the initialization of the simplicial complex. Ultimately, VeSCA generates consistently transferable adversarial examples through random simplicial complex sampling. Extensive experiments demonstrate that VeSCA improves performance by 12.7% over state-of-the-art methods across three downstream model categories and five domain-specific datasets. Our findings further highlight the downstream model risks posed by SAM's vulnerabilities and emphasize the urgency of developing more robust foundation models.
Key Contributions
- VeSCA: a transferable adversarial attack that uses only SAM's encoder by modeling shared vulnerable regions between SAM and downstream models via a parametric simplicial complex
- Iterative vertex-wise refinement procedure to locate adversarially potent simplicial complex regions across domains
- Lightweight domain re-adaptation strategy that bridges domain divergence using minimal reference data to initialize the simplicial complex
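
To make the idea of random sampling from a simplicial complex concrete, here is a minimal sketch of drawing one perturbation from a single simplex. It is an illustration under stated assumptions, not the authors' implementation: the summary does not give VeSCA's parametrization, so the vertex layout (one perturbation per row), the uniform Dirichlet weights, the L-infinity budget `eps`, and the function name `sample_from_simplex` are all hypothetical.

```python
import numpy as np


def sample_from_simplex(vertices, eps, rng):
    """Draw one perturbation from the simplex spanned by `vertices`.

    vertices: array of shape (k, d); each row is one perturbation vertex
              (hypothetical layout, not taken from the paper).
    eps:      L-infinity perturbation budget (assumed threat model).
    """
    k = vertices.shape[0]
    # Dirichlet(1, ..., 1) is uniform over barycentric weights:
    # non-negative values that sum to 1.
    weights = rng.dirichlet(np.ones(k))
    # Convex combination of the vertices.
    delta = weights @ vertices
    # A convex combination of in-budget vertices is already in budget;
    # the clip only guards against vertices that exceed it.
    return np.clip(delta, -eps, eps), weights


rng = np.random.default_rng(0)
eps = 8 / 255
vertices = rng.uniform(-eps, eps, size=(3, 16))  # toy vertices, 16-dim input
delta, weights = sample_from_simplex(vertices, eps, rng)
```

Each call yields a different point of the same simplex, so a refined set of vertices can produce many distinct perturbations without further optimization.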
🛡️ Threat Analysis
VeSCA crafts adversarial perturbations at inference time using SAM's encoder gradients, targeting shared vulnerable regions to maximize transferability to black-box downstream models — a textbook gradient-based adversarial example attack.
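
For readers unfamiliar with the attack class named here, the following sketch shows a generic projected sign-gradient (PGD-style) attack that pushes an encoder's features away from their clean values. It is not VeSCA: a toy linear map `W` stands in for SAM's image encoder so the gradient can be written by hand, and the budget, step size, iteration count, and function name `pgd_feature_attack` are assumptions chosen for illustration.

```python
import numpy as np


def project(delta, x, eps):
    """Project delta onto the eps-ball and keep x + delta inside [0, 1]."""
    delta = np.clip(delta, -eps, eps)
    return np.clip(x + delta, 0.0, 1.0) - x


def pgd_feature_attack(W, x, eps, step, n_iter, rng):
    """Maximize ||W(x + delta) - W x||^2 under an L-infinity budget.

    W is a toy linear encoder (assumption); a real encoder would need
    automatic differentiation instead of the closed-form gradient below.
    Returns the perturbation and the loss recorded at each iterate.
    """
    # Random start: at delta = 0 the gradient of this loss is exactly zero.
    delta = project(rng.uniform(-eps, eps, size=x.shape), x, eps)
    history = [float(np.sum((W @ delta) ** 2))]
    for _ in range(n_iter):
        # For a linear encoder the loss is ||W delta||^2,
        # whose gradient with respect to delta is 2 W^T W delta.
        grad = 2.0 * W.T @ (W @ delta)
        delta = project(delta + step * np.sign(grad), x, eps)
        history.append(float(np.sum((W @ delta) ** 2)))
    return delta, history


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))       # toy encoder: 16-dim input, 8-dim features
x = rng.uniform(0.0, 1.0, size=16)  # toy "image" with values in [0, 1]
eps = 8 / 255
delta, history = pgd_feature_attack(W, x, eps, step=eps / 4, n_iter=20, rng=rng)
```

Only the surrogate encoder is queried for gradients; the downstream model is never touched, which is what makes the resulting example a black-box transfer attack against that model.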