
SHIFT: Stochastic Hidden-Trajectory Deflection for Removing Diffusion-based Watermark

Rui Bao 1, Zheng Gao 2, Xiaoyu Li 1, Xiaoyan Feng 1, Yang Song 1, Jiaojiao Jiang 1


Published on arXiv (arXiv:2603.29742)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves 95–100% watermark removal success rates across nine watermarking methods spanning noise-space, frequency-domain, and optimization-based paradigms, with nearly no loss in semantic quality

SHIFT

Novel technique introduced


Diffusion-based watermarking methods embed verifiable marks by manipulating the initial noise or the reverse diffusion trajectory. However, these methods share a critical assumption: verification can succeed only if the diffusion trajectory can be faithfully reconstructed. This reliance on trajectory recovery constitutes a fundamental and exploitable vulnerability. We propose $\underline{\mathbf{S}}$tochastic $\underline{\mathbf{Hi}}$dden-Trajectory De$\underline{\mathbf{f}}$lec$\underline{\mathbf{t}}$ion ($\mathbf{SHIFT}$), a training-free attack that exploits this common weakness across diverse watermarking paradigms. SHIFT leverages stochastic diffusion resampling to deflect the generative trajectory in latent space, making the reconstructed image statistically decoupled from the original watermark-embedded trajectory while preserving strong visual quality and semantic consistency. Extensive experiments on nine representative watermarking methods spanning noise-space, frequency-domain, and optimization-based paradigms show that SHIFT achieves 95%--100% attack success rates with nearly no loss in semantic quality, without requiring any watermark-specific knowledge or model retraining.
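The resampling idea can be sketched numerically. The snippet below is a minimal, self-contained illustration, not the paper's implementation: the correlation detector, the key-in-noise embedding, and the `keep` ratio are simplifying assumptions standing in for a generic noise-space watermarking scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4096  # flattened latent dimension (illustrative)

# Toy noise-space watermark: a secret key pattern mixed into the initial noise.
key = rng.standard_normal(d)
z_watermarked = key + 0.5 * rng.standard_normal(d)

def detect(z, key, thresh=0.7):
    """Toy verifier: normalized correlation between a recovered latent and the key."""
    corr = float(z @ key / (np.linalg.norm(z) * np.linalg.norm(key)))
    return corr > thresh, corr

# Verification succeeds when the watermark-bearing latent is faithfully recovered.
passed, corr_before = detect(z_watermarked, key)   # detected

# Stochastic deflection: partially re-noise the latent with fresh Gaussian noise,
# statistically decoupling it from the watermarked trajectory while retaining
# most of the original signal ('keep' is an assumed, illustrative ratio).
keep = 0.3
z_deflected = np.sqrt(keep) * z_watermarked + np.sqrt(1 - keep) * rng.standard_normal(d)

still_detected, corr_after = detect(z_deflected, key)   # falls below threshold
```

In this toy setting the detector statistic drops roughly with √keep, so verification fails even though most of the latent's energy is preserved; SHIFT performs the analogous decoupling in the diffusion model's latent space while keeping visual and semantic quality intact.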


Key Contributions

  • Training-free watermark removal attack exploiting trajectory reconstruction dependency across diverse diffusion watermarking paradigms
  • Stochastic resampling technique that deflects generative trajectories while preserving semantic quality
  • Achieves 95-100% attack success rates against nine representative watermarking methods without watermark-specific knowledge

🛡️ Threat Analysis

Output Integrity Attack

This paper attacks content watermarking schemes embedded in diffusion model outputs. Watermark removal is a classic ML09 attack — it defeats output integrity/provenance verification mechanisms. The watermarks are embedded in generated images to verify authenticity, and SHIFT removes them while preserving visual quality.
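Why trajectory recovery is the single point of failure can be seen in a toy DDIM round trip. The sketch below is a simplification under an assumed linear noise predictor `eps(x) = C * x`, which makes inversion exact (real samplers only approximate it): deterministic generation can be inverted back to the exact initial noise, but a single stochastic resampling of the output latent destroys that recovery.

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 20, 256
alphas = np.linspace(0.9995, 0.98, T)  # toy noise schedule
abar = np.cumprod(alphas)              # cumulative products (alpha-bar)
C = 0.1                                # assumed linear noise predictor: eps(x) = C * x

def ddim_step(x, t):
    """Deterministic DDIM update x_t -> x_{t-1} (eta = 0)."""
    a_t = abar[t]
    a_prev = abar[t - 1] if t > 0 else 1.0
    eps = C * x
    x0 = (x - np.sqrt(1 - a_t) * eps) / np.sqrt(a_t)
    return np.sqrt(a_prev) * x0 + np.sqrt(1 - a_prev) * eps

def ddim_invert_step(x, t):
    """Exact inverse of ddim_step (possible here only because eps is linear in x)."""
    a_t = abar[t]
    a_prev = abar[t - 1] if t > 0 else 1.0
    m = np.sqrt(a_prev) * (1 - np.sqrt(1 - a_t) * C) / np.sqrt(a_t) + np.sqrt(1 - a_prev) * C
    return x / m

z_T = rng.standard_normal(d)           # watermark-bearing initial noise
x = z_T
for t in range(T - 1, -1, -1):         # generate: denoise from t = T-1 down to 0
    x = ddim_step(x, t)

z_rec = x
for t in range(T):                     # invert the trajectory back to the noise
    z_rec = ddim_invert_step(z_rec, t)
err_clean = np.linalg.norm(z_rec - z_T) / np.linalg.norm(z_T)   # ~0: verifiable

# One stochastic resampling of the output latent breaks the round trip,
# even though 90% of the signal variance is kept.
keep = 0.9
x_deflected = np.sqrt(keep) * x + np.sqrt(1 - keep) * rng.standard_normal(d)
z_rec2 = x_deflected
for t in range(T):
    z_rec2 = ddim_invert_step(z_rec2, t)
err_attacked = np.linalg.norm(z_rec2 - z_T) / np.linalg.norm(z_T)  # large: not verifiable
```

Because the shared assumption the paper identifies is that verification runs some variant of this inversion, injecting fresh stochasticity into the latent defeats verification across paradigms at once, which is consistent with SHIFT transferring across nine methods without watermark-specific knowledge.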


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
inference_time, black_box
Applications
image generation, content authentication, watermark verification