Of-SemWat: High-payload text embedding for semantic watermarking of AI-generated images with arbitrary size
Benedetta Tondi, Andrea Costanzo, Mauro Barni
Published on arXiv
2509.24823
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Embedded semantic text is retrievable after a wide variety of image processing and AI inpainting, enabling detection of semantic modifications via image-text mismatch analysis in large-scale AI-generated images
OF-SemWat
Novel technique introduced
We propose a high-payload image watermarking method for textual embedding, in which a semantic description of the image (which may also correspond to the input text prompt) is embedded inside the image. To robustly embed high payloads in large-scale images, such as those produced by modern AI generators, the proposed approach builds on a traditional watermarking scheme that exploits orthogonal and turbo codes for improved robustness, and integrates frequency-domain embedding and perceptual masking techniques to enhance watermark imperceptibility. Experiments show that the proposed method is extremely robust against a wide variety of image processing operations, and that the embedded text can still be retrieved after traditional and AI inpainting, making it possible to reveal the semantic modification the image has undergone via image-text mismatch analysis.
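To make the role of orthogonal codes concrete, here is a minimal toy sketch in pure Python. It is not the authors' implementation. It assumes the host is simply a list of hypothetical transform-domain coefficients, maps each 4-bit chunk of the text to one Walsh-Hadamard codeword, and decodes by correlation. Turbo coding, the actual frequency transform, and perceptual masking are all omitted, and `CODE_LEN`, `GAIN`, and the noise levels are arbitrary illustration values.

```python
import random

CODE_LEN = 1024  # chips per symbol (power of two); illustrative value
GAIN = 1.0       # embedding strength; illustrative value


def walsh_row(index, length):
    """Row `index` of the Sylvester-Hadamard matrix of size `length` (+1/-1 entries)."""
    return [1 - 2 * (bin(index & j).count("1") % 2) for j in range(length)]


# 16 mutually orthogonal codewords, one per 4-bit symbol.
# Row 0 (all ones) is skipped so no codeword correlates with the host mean.
CODES = [walsh_row(i, CODE_LEN) for i in range(1, 17)]


def text_to_symbols(text):
    """Split each UTF-8 byte into two 4-bit symbols, high nibble first."""
    symbols = []
    for byte in text.encode("utf-8"):
        symbols += [byte >> 4, byte & 0x0F]
    return symbols


def symbols_to_text(symbols):
    """Inverse of text_to_symbols."""
    pairs = zip(symbols[0::2], symbols[1::2])
    return bytes((hi << 4) | lo for hi, lo in pairs).decode("utf-8", errors="replace")


def embed(host, symbols):
    """Add one scaled codeword to each consecutive chunk of host coefficients."""
    marked = list(host)
    for i, s in enumerate(symbols):
        for j, chip in enumerate(CODES[s]):
            marked[i * CODE_LEN + j] += GAIN * chip
    return marked


def extract(received, num_symbols):
    """Correlate each chunk with every codeword and keep the best match."""
    symbols = []
    for i in range(num_symbols):
        chunk = received[i * CODE_LEN:(i + 1) * CODE_LEN]
        scores = [sum(c * x for c, x in zip(code, chunk)) for code in CODES]
        symbols.append(scores.index(max(scores)))
    return symbols


# Toy round trip: Gaussian "coefficients" as host, additive noise as the attack.
rng = random.Random(0)
prompt = "a red car parked near a tree"  # hypothetical description
symbols = text_to_symbols(prompt)
host = [rng.gauss(0.0, 2.0) for _ in range(len(symbols) * CODE_LEN)]
attacked = [x + rng.gauss(0.0, 1.0) for x in embed(host, symbols)]
recovered = symbols_to_text(extract(attacked, len(symbols)))
```

Because the codewords are orthogonal, the correct one accumulates a correlation of `GAIN * CODE_LEN` while the others see only host and noise interference. Longer codes therefore trade payload for robustness. A real scheme would add an error-correcting layer on top, such as the turbo codes mentioned in the abstract.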
Key Contributions
- Post-generation watermarking scheme (OF-SemWat) capable of embedding high-payload textual descriptions or input prompts into arbitrary-resolution images, including large-scale AI-generated images (1024×1024 and larger)
- Integration of orthogonal and turbo codes with frequency-domain embedding and perceptual masking for robustness and imperceptibility
- Semantic modification detection via image-text mismatch analysis after watermark extraction, surviving both traditional and AI-based inpainting
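The mismatch idea in the last contribution can be illustrated with a deliberately simple stand-in. This summary does not say how the paper measures image-text mismatch, so the sketch below makes two assumptions: that a caption of the received image is already available from some external captioner (hypothetical here), and that mismatch can be approximated by word overlap (Jaccard distance) between that caption and the extracted watermark text. The function names and the threshold are illustrative only.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def mismatch_score(extracted_text, observed_caption):
    """Jaccard distance between word sets: 0.0 = identical, 1.0 = disjoint."""
    a, b = tokens(extracted_text), tokens(observed_caption)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def looks_modified(extracted_text, observed_caption, threshold=0.4):
    """Flag the image when the caption drifts too far from the embedded text.

    The threshold is an arbitrary illustration value, not taken from the paper.
    """
    return mismatch_score(extracted_text, observed_caption) > threshold


embedded = "a red car parked near a tree"           # text recovered from the watermark
caption_now = "a blue truck parked near a tree"     # hypothetical caption after inpainting
flagged = looks_modified(embedded, caption_now)
```

Word overlap is brittle against paraphrase, so a practical system would more plausibly compare the extracted text with the image through a vision-language model. The decision structure (extract, compare, threshold) stays the same.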
🛡️ Threat Analysis
Proposes watermarking of AI-generated image OUTPUTS (not model weights) with semantic text to trace provenance, enable proactive deepfake detection, and expose semantic manipulations via image-text mismatch analysis — squarely an output integrity and content authentication contribution.