Towards Robust Content Watermarking Against Removal and Forgery Attacks

Generated content has raised serious concerns about copyright protection, image provenance, and credit attribution. Watermarking is a potential solution to these problems. Recently, content watermarking for text-to-image diffusion models has been studied extensively for its effective detection and robustness. However, these watermarking techniques remain vulnerable to adversarial attacks such as removal and forgery attacks. In this paper, we build a novel watermarking paradigm called Instance-Specific watermarking with Two-Sided detection (ISTS) to resist removal and forgery attacks. Specifically, we introduce a strategy that dynamically controls the injection time and watermarking patterns based on the semantics of users' prompts. Furthermore, we propose a new two-sided detection approach to enhance robustness in watermark detection. Experiments demonstrate the superiority of our watermarking against removal and forgery attacks.

Key Contributions

The Instance-Specific watermarking with Two-Sided detection (ISTS) paradigm, which dynamically controls injection time and watermark patterns based on prompt semantics
A two-sided detection approach that enhances robustness against removal and forgery attacks
Experiments demonstrating superior robustness against adversarial watermark manipulation attacks
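
To make the idea of instance-specific parameters in the first contribution concrete, here is a minimal sketch of deriving a per-prompt injection step and pattern index. It is not the paper's method: ISTS conditions on prompt semantics, whereas this sketch swaps in a plain SHA-256 hash of the normalized prompt as a deterministic stand-in. The function name `instance_params` and the parameters `num_steps` and `num_patterns` are hypothetical and chosen only for illustration.

```python
import hashlib


def instance_params(prompt: str, num_steps: int = 50, num_patterns: int = 8):
    """Map a prompt to a (hypothetical) injection step and pattern index.

    Assumption: a hash of the prompt text stands in for the semantic
    signal used by ISTS; a hash does not capture semantics.
    """
    # Normalize so trivial whitespace/case changes give the same parameters.
    normalized = prompt.strip().lower().encode("utf-8")
    digest = hashlib.sha256(normalized).digest()

    # First 4 bytes choose the denoising step at which to inject.
    step = int.from_bytes(digest[:4], "big") % num_steps
    # Next 4 bytes choose one of the candidate watermark patterns.
    pattern = int.from_bytes(digest[4:8], "big") % num_patterns
    return step, pattern


if __name__ == "__main__":
    print(instance_params("a watercolor painting of a lighthouse"))
```

Because the mapping is deterministic, a detector that knows the prompt (or can estimate it) can recompute the same step and pattern, while an attacker who lacks that information cannot assume a single fixed watermark across images. A semantics-aware variant would replace the hash with a text embedding, which this sketch does not attempt.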

🛡️ Threat Analysis

Output Integrity Attack

Proposes content watermarking for diffusion-generated images to protect provenance and resist watermark removal and forgery attacks; this falls under output integrity and content authentication.