MetaSeal: Defending Against Image Attribution Forgery Through Content-Dependent Cryptographic Watermarks
Tong Zhou 1, Ruyi Ding 1, Gaowen Liu 2, Charles Fleming 2, Ramana Rao Kompella 2, Yunsi Fei 1, Xiaolin Xu 1, Shaolei Ren 3
Published on arXiv
2509.10766
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
MetaSeal effectively mitigates watermark forgery attempts on both natural and AI-generated images by combining content-dependent digital signatures with cryptographic verification
MetaSeal
Novel technique introduced
The rapid growth of digital and AI-generated images has amplified the need for secure and verifiable methods of image attribution. While digital watermarking offers more robust protection than metadata-based approaches, which can be easily stripped, current watermarking techniques remain vulnerable to forgery, creating risks of misattribution that can damage the reputations of AI model developers and the rights of digital artists. The vulnerabilities of digital watermarking arise from two key issues: (1) content-agnostic watermarks, which, once learned or leaked, can be transferred across images to fake attribution, and (2) reliance on detector-based verification, which is unreliable since detectors can be tricked. We present MetaSeal, a novel framework for content-dependent watermarking with cryptographic security guarantees to safeguard image attribution. Our design provides (1) forgery resistance, preventing unauthorized replication and enforcing cryptographic verification; (2) robust self-contained protection, embedding attribution directly into images while maintaining robustness against benign transformations; and (3) evidence of tampering, making malicious alterations visually detectable. Experiments demonstrate that MetaSeal effectively mitigates forgery attempts and applies to both natural and AI-generated images, establishing a new standard for secure image attribution. Code is available at: https://github.com/Tongzhou0101/MetaSeal.
Key Contributions
- Content-dependent watermarking scheme that cryptographically binds watermarks to image semantics, preventing transfer-based forgery attacks
- Cryptographic verification that replaces vulnerable detector-based attribution, so attribution no longer depends on a detector that adversarial inputs can fool
- Self-contained tamper-evidence mechanism that makes malicious alterations visually detectable while remaining robust to benign transformations (e.g., JPEG compression)
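The idea behind the first two contributions, binding an attribution tag to the content it covers and checking it cryptographically instead of with a learned detector, can be illustrated with a minimal sketch. This is not MetaSeal's implementation: it uses an HMAC from the Python standard library as a stand-in for a public-key digital signature, hashes raw pixel bytes as a stand-in for a content descriptor, and does not embed anything into the image. The key, the owner ID, the byte strings, and the function names are all hypothetical. A raw-byte hash is also not robust to benign transformations such as JPEG compression, which the paper's scheme is designed to tolerate.

```python
import hashlib
import hmac

# Hypothetical signing key. A real scheme would use a private/public key pair
# so that anyone can verify without being able to sign.
SECRET_KEY = b"demo-key-not-for-production"


def content_digest(pixels: bytes) -> bytes:
    """Stand-in content descriptor: SHA-256 over the raw pixel bytes."""
    return hashlib.sha256(pixels).digest()


def make_tag(pixels: bytes, owner: str) -> bytes:
    """Create a tag that binds the owner ID to this specific content."""
    message = owner.encode("utf-8") + b"|" + content_digest(pixels)
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()


def verify_tag(pixels: bytes, owner: str, tag: bytes) -> bool:
    """Recompute the tag from the content and compare in constant time."""
    return hmac.compare_digest(make_tag(pixels, owner), tag)


# Two toy "images" represented as raw bytes (assumed data for illustration).
original = bytes([10, 20, 30, 40] * 16)
other = bytes([200, 100, 50, 25] * 16)

tag = make_tag(original, "artist-42")

# The tag verifies on the content it was created for.
genuine_ok = verify_tag(original, "artist-42", tag)

# Transfer forgery: reusing the same tag on a different image fails,
# because the tag depends on the content digest.
transfer_ok = verify_tag(other, "artist-42", tag)

# Tampering: changing a single byte of the original also breaks verification.
tampered = bytes([11]) + original[1:]
tampered_ok = verify_tag(tampered, "artist-42", tag)

print(genuine_ok, transfer_ok, tampered_ok)
```

A content-agnostic watermark corresponds to a tag computed from the owner ID alone; such a tag would verify on any image, which is the transfer weakness the abstract describes.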
🛡️ Threat Analysis
MetaSeal embeds content-dependent watermarks into image outputs (not model weights) to establish verifiable attribution and provenance. It resists forgery attacks in which adversaries transfer watermarks across images or fool detectors with adversarial perturbations. Both are core output integrity and content authentication concerns.