Vanish into Thin Air: Cross-prompt Universal Adversarial Attacks for SAM2

Recent studies reveal the vulnerability of the image segmentation foundation model SAM to adversarial examples. Its successor, SAM2, has attracted significant attention due to its strong generalization capability in video segmentation. However, its robustness remains unexplored, and it is unclear whether existing attacks on SAM can be directly transferred to SAM2. In this paper, we first analyze the performance gap of existing attacks between SAM and SAM2 and highlight two key challenges arising from their architectural differences: directional guidance from the prompt and semantic entanglement across consecutive frames. To address these issues, we propose UAP-SAM2, the first cross-prompt universal adversarial attack against SAM2 driven by dual semantic deviation. For cross-prompt transferability, we design a target-scanning strategy that divides each frame into k regions, each randomly assigned a prompt, to reduce prompt dependency during optimization. For effectiveness, we design a dual semantic deviation framework that optimizes a universal adversarial perturbation (UAP) by distorting the semantics within the current frame and disrupting the semantic consistency across consecutive frames. Extensive experiments on six datasets across two segmentation tasks demonstrate the effectiveness of the proposed method against SAM2. The comparative results show that UAP-SAM2 outperforms state-of-the-art (SOTA) attacks by a large margin.
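The target-scanning idea described above (split each frame into k regions and give each region a random prompt) can be illustrated with a minimal sketch. The abstract does not say how the regions are laid out or what prompt type is used, so the near-square grid, the point prompts, and the function name `scan_prompts` are all assumptions made for illustration only.

```python
import math
import random


def scan_prompts(height, width, k, rng=None):
    """Return k (x, y) point prompts, one drawn uniformly from each grid cell.

    Assumption: the frame is split into a near-square grid of k cells.
    The paper may partition frames differently.
    """
    rng = rng or random.Random()
    cols = math.ceil(math.sqrt(k))
    rows = math.ceil(k / cols)
    if height < rows or width < cols:
        raise ValueError("frame is too small for the requested number of regions")

    prompts = []
    for idx in range(k):
        r, c = divmod(idx, cols)
        # Integer cell boundaries that together cover the whole frame.
        y0, y1 = r * height // rows, (r + 1) * height // rows
        x0, x1 = c * width // cols, (c + 1) * width // cols
        prompts.append((rng.randrange(x0, x1), rng.randrange(y0, y1)))
    return prompts


# Example: four point prompts for a 480 x 640 frame, seeded for repeatability.
points = scan_prompts(480, 640, 4, random.Random(0))
```

Because a fresh set of prompts can be drawn for every frame and every optimization step, the perturbation is never tuned against one fixed prompt location, which is the intuition behind reduced prompt dependency.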

Key Contributions

First analysis of the architectural gap between SAM and SAM2 that limits direct attack transfer, identifying prompt directional guidance and cross-frame semantic entanglement as key challenges.
Target-scanning strategy that partitions each frame into k regions, each assigned a random prompt, to reduce prompt dependency and enable cross-prompt transferability of the UAP.
Dual semantic deviation framework that simultaneously distorts within-frame semantics and disrupts cross-frame semantic consistency to attack SAM2's video segmentation.
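One way to picture the dual semantic deviation objective is as the sum of two similarity terms that the attacker tries to drive down. The sketch below is an assumption, not the paper's loss: it uses cosine similarity over per-frame feature vectors, a hypothetical weight `lam`, and hypothetical function names, since the summary does not give the actual formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def dual_deviation_loss(clean_feats, adv_feats, lam=1.0):
    """Illustrative attacker objective (lower means a stronger deviation).

    intra: similarity between clean and perturbed features of the same frame.
    inter: similarity between perturbed features of consecutive frames.
    Both terms and the weight `lam` are assumptions for this sketch.
    """
    n = len(adv_feats)
    intra = sum(cosine(c, a) for c, a in zip(clean_feats, adv_feats)) / n
    if n > 1:
        inter = sum(cosine(adv_feats[t], adv_feats[t + 1]) for t in range(n - 1)) / (n - 1)
    else:
        inter = 0.0  # a single frame has no cross-frame term
    return intra + lam * inter


# Toy two-frame video with 2-D features.
clean = [[1.0, 0.0], [1.0, 0.0]]
unperturbed = dual_deviation_loss(clean, clean)
perturbed = dual_deviation_loss(clean, [[0.0, 1.0], [0.0, -1.0]])
```

In the toy example the unperturbed features score close to 2 (both terms near 1), while features that are orthogonal to the clean ones and opposed across frames score close to -1, so minimizing this quantity rewards both kinds of deviation.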

🛡️ Threat Analysis

Input Manipulation Attack

Proposes gradient-optimized universal adversarial perturbations (UAPs) that are applied at inference time to cause segmentation failure in SAM2, a canonical input manipulation attack targeting a vision foundation model.
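For readers unfamiliar with gradient-optimized UAPs, the following sketch shows a generic signed-gradient update on a single perturbation shared across inputs. The L-infinity budget `eps`, the step size, the flat pixel lists, and the function names are assumptions chosen for illustration; the summary does not state which optimizer or norm constraint UAP-SAM2 uses.

```python
def uap_step(delta, grads, step, eps):
    """One signed-gradient ascent step on a shared perturbation.

    delta: current perturbation as a flat list of floats.
    grads: per-sample gradients of the attack objective w.r.t. the input.
    The result is clipped back into the [-eps, eps] budget.
    """
    n = len(grads)
    avg = [sum(g[i] for g in grads) / n for i in range(len(delta))]
    sign = [(a > 0) - (a < 0) for a in avg]
    return [max(-eps, min(eps, d + step * s)) for d, s in zip(delta, sign)]


def apply_uap(pixels, delta):
    """Add the perturbation to an input and keep pixel values in [0, 1]."""
    return [min(1.0, max(0.0, p + d)) for p, d in zip(pixels, delta)]


# Two toy "samples" with three pixels each share one perturbation.
delta = [0.0, 0.0, 0.0]
grads = [[1.0, -2.0, 0.0], [3.0, -1.0, 0.0]]
delta = uap_step(delta, grads, step=0.5, eps=0.6)  # [0.5, -0.5, 0.0]
delta = uap_step(delta, grads, step=0.5, eps=0.6)  # [0.6, -0.6, 0.0]
adv = apply_uap([0.9, 0.1, 0.5], delta)            # [1.0, 0.0, 0.5]
```

The key property of a universal perturbation is visible here: `delta` is updated from gradients averaged over several inputs and is then added unchanged to any new input at inference time.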

Details

Domains

vision

Model Types

transformer

Threat Tags

white_box, inference_time, untargeted, digital

Datasets

SA-1B, DAVIS, YouTube-VOS, MOSE, COCO, PASCAL VOC

Applications

2025 0 cit.
