WaterVIB: Learning Minimal Sufficient Watermark Representations via Variational Information Bottleneck

Robust watermarking is critical for intellectual property protection, yet existing methods are severely vulnerable to regeneration-based AIGC attacks. We identify the cause: existing methods entangle the watermark with high-frequency cover texture, which is readily rewritten during generative purification. To address this, we propose WaterVIB, a theoretically grounded framework that reformulates the encoder as an information sieve via the Variational Information Bottleneck. Instead of overfitting to fragile cover details, our approach forces the model to learn a Minimal Sufficient Statistic of the message. This filters out redundant cover nuances that are prone to generative shifts and retains only the essential signal that is invariant to regeneration. We theoretically prove that optimizing this bottleneck is a necessary condition for robustness against distribution-shifting attacks. Extensive experiments demonstrate that WaterVIB significantly outperforms state-of-the-art methods, achieving superior zero-shot resilience against unknown diffusion-based editing.
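To make the bottleneck idea in the abstract concrete, here is a minimal sketch of a generic VIB-style training loss in plain Python. It is not the authors' implementation: the function names, the diagonal-Gaussian latent, the standard-normal prior, the binary cross-entropy message loss, and the toy "decoder" are all assumptions chosen for illustration, since this excerpt does not specify the architecture or the exact objective.

```python
import math
import random

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def message_bce(bits, logits):
    """Mean binary cross-entropy between message bits and decoder logits."""
    total = 0.0
    for b, x in zip(bits, logits):
        # Numerically stable form of -[b*log(s(x)) + (1-b)*log(1-s(x))]
        total += max(x, 0.0) - b * x + math.log1p(math.exp(-abs(x)))
    return total / len(bits)

def vib_loss(bits, logits, mu, logvar, beta):
    """Message-recovery term plus beta-weighted compression (KL) term."""
    return message_bce(bits, logits) + beta * kl_to_standard_normal(mu, logvar)

# Toy usage with hypothetical numbers: a 4-bit message and a 4-dim latent.
bits = [1, 0, 1, 1]
mu = [1.5, -1.2, 0.9, 1.1]        # assumed encoder output (latent means)
logvar = [-2.0, -2.0, -2.0, -2.0]  # assumed encoder output (latent log-variances)
rng = random.Random(0)
z = reparameterize(mu, logvar, rng)
logits = [4.0 * zi for zi in z]    # stand-in for a learned message decoder

for beta in (0.0, 0.01, 0.1):
    print(beta, vib_loss(bits, logits, mu, logvar, beta))
```

The weight `beta` controls the trade-off: larger values push the latent toward the prior and discard more input detail, while smaller values favor message recovery.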

Key Contributions

WaterVIB framework that reformulates the watermark encoder as an information sieve using the Variational Information Bottleneck, forcing it to learn a Minimal Sufficient Statistic of the message rather than entangling it with fragile high-frequency cover texture
Theoretical proof that optimizing the VIB bottleneck is a necessary condition for robustness against distribution-shifting (regeneration-based) attacks
Zero-shot resilience against unknown diffusion-based editing, outperforming state-of-the-art watermarking methods across six AIGC editing benchmarks
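
For reference, the generic information bottleneck objective and its standard variational bound can be written as below. This is the textbook VIB form, not necessarily the exact objective used in the paper, which this excerpt does not state. The symbols are assumptions: u = (x, m) denotes the encoder input (cover image x and message m), z the latent watermark representation, q_phi the message decoder, r(z) a fixed prior, and beta the trade-off weight.

```latex
% Information bottleneck: keep message information, compress everything else
\max_{\theta}\; I(Z; M) \;-\; \beta\, I(Z; U), \qquad U = (X, M)

% Variational bound minimized in practice (standard VIB form)
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{p(u)}\, \mathbb{E}_{p_\theta(z \mid u)}
      \big[ -\log q_\phi(m \mid z) \big]
  \;+\; \beta\, \mathbb{E}_{p(u)}
      \big[ \mathrm{KL}\big( p_\theta(z \mid u) \,\|\, r(z) \big) \big]
```

The first term is a lower-bound surrogate for I(Z; M), and the averaged KL term is an upper bound on I(Z; U), so minimizing the loss favors representations that are sufficient for the message while carrying as little of the remaining input as possible.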

🛡️ Threat Analysis

Output Integrity Attack

WaterVIB is a content watermarking defense: it embeds watermarks in image outputs for copyright protection and provenance, and it defends specifically against regeneration-based AIGC attacks (diffusion models) that erase those watermarks. This is a classic output integrity problem, namely watermark robustness against removal attacks.

Details

Domains

vision, generative

Model Types

diffusion

Threat Tags

black_box, inference_time

Datasets

six AIGC editing benchmarks (unspecified in excerpt)

Applications

Year: 2025. Citations: 0.
