ZK-WAGON: Imperceptible Watermark for Image Generation Models using ZK-SNARKs
Aadarsh Anantha Ramakrishnan , Shubham Agarwal , Selvanayagam S , Kunwar Singh
Published on arXiv
2510.01967
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Demonstrates a verifiable, imperceptible watermarking pipeline for both GAN and diffusion models that proves image origin via ZK-SNARKs without leaking any model internals.
ZK-WAGON (SL-ZKCC)
Novel technique introduced
As image generation models grow increasingly powerful and accessible, concerns around authenticity, ownership, and misuse of synthetic media have become critical. The ability to generate lifelike images indistinguishable from real ones introduces risks such as misinformation, deepfakes, and intellectual property violations. Traditional watermarking methods either degrade image quality, are easily removed, or require access to confidential model internals, making them unsuitable for secure and scalable deployment. We are the first to introduce ZK-WAGON, a novel system for watermarking image generation models using Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge (ZK-SNARKs). Our approach enables verifiable proof of origin without exposing model weights, generation prompts, or any sensitive internal information. We propose Selective Layer ZK-Circuit Creation (SL-ZKCC), a method that selectively converts key layers of an image generation model into a circuit, significantly reducing proof generation time. Generated ZK-SNARK proofs are imperceptibly embedded into a generated image via Least Significant Bit (LSB) steganography. We demonstrate this system on both GAN and diffusion models, providing a secure, model-agnostic pipeline for trustworthy AI image generation.
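The LSB embedding step described in the abstract can be sketched in a few lines: each bit of the serialized proof overwrites the least significant bit of one pixel channel, changing each channel value by at most 1. This is a minimal toy sketch, not the paper's implementation; the pixel buffer and the `b"zk-proof"` payload are illustrative stand-ins for a real image and a real serialized ZK-SNARK proof.

```python
def embed_lsb(pixels: bytearray, payload: bytes) -> bytearray:
    """Embed payload bits into the LSB of each byte in `pixels`."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for payload")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return out

def extract_lsb(pixels: bytearray, n_bytes: int) -> bytes:
    """Recover n_bytes of payload from the pixel LSBs."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)

# Round trip with a dummy "proof" payload.
pixels = bytearray(range(64))   # stand-in for flattened RGB channel values
proof = b"zk-proof"             # stand-in for a serialized ZK-SNARK proof
stego = embed_lsb(pixels, proof)
assert extract_lsb(stego, len(proof)) == proof
# Each channel moves by at most 1, which is why the mark is imperceptible.
assert all(abs(a - b) <= 1 for a, b in zip(pixels, stego))
```

In practice the payload would be prefixed with its length (and likely an error-correcting code) so the verifier knows how many bits to read back.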
Key Contributions
- First system combining ZK-SNARKs with image watermarking, enabling verifiable proof of model origin without exposing model weights, prompts, or internal parameters
- Selective Layer ZK-Circuit Creation (SL-ZKCC), a method to convert only key model layers into ZK circuits, significantly reducing proof generation overhead
- End-to-end model-agnostic pipeline that imperceptibly embeds ZK-SNARK proofs into generated images via LSB steganography, demonstrated on both GANs and diffusion models
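The core idea behind SL-ZKCC, as described above, is to arithmetize only a chosen subset of layers rather than the whole model, since full-model circuits make proof generation prohibitively slow. The sketch below illustrates that idea with a hypothetical selection heuristic (largest layers first, under a constraint budget); the paper's actual selection criterion is not reproduced here, and `n_params` is only a rough proxy for per-layer circuit cost.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    n_params: int  # rough proxy for the constraints this layer adds to the circuit

def select_layers(layers: list[Layer], budget: int) -> list[str]:
    """Greedily keep the largest (most model-identifying) layers within budget.

    Hypothetical stand-in for SL-ZKCC's selection step: only the returned
    layers would be compiled into the ZK circuit.
    """
    chosen, used = [], 0
    for layer in sorted(layers, key=lambda l: l.n_params, reverse=True):
        if used + layer.n_params <= budget:
            chosen.append(layer.name)
            used += layer.n_params
    return chosen

# Toy model: a huge attention block is skipped because it blows the budget.
model = [Layer("conv1", 1_000), Layer("conv2", 50_000),
         Layer("attn", 200_000), Layer("head", 5_000)]
print(select_layers(model, budget=60_000))  # → ['conv2', 'head', 'conv1']
```

Whatever the real criterion, the trade-off is the same: fewer layers in the circuit means faster proofs, but the selected layers must still bind the proof to this specific model.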
🛡️ Threat Analysis
The watermark (a ZK-SNARK proof) is embedded into the image output via steganography, not into the model weights, with the explicit goal of content provenance verification for AI-generated images. This is squarely content watermarking: tracing and authenticating model-generated outputs, directly addressing synthetic-media authenticity and deepfake attribution concerns.
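The provenance check described above can be sketched end to end on the verifier's side: read the payload back out of the pixel LSBs, then check the proof against a public verification key. This is a hedged sketch; `verify_snark` is a stub standing in for a real ZK-SNARK verifier (e.g. pairing checks in a scheme like Groth16), the `b"zkp:"` prefix is an invented marker, and no actual cryptography is performed.

```python
def read_lsb_payload(pixels: bytes, n_bytes: int) -> bytes:
    """Reassemble n_bytes hidden in the least significant bits of `pixels`."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)

def verify_snark(proof: bytes, verification_key: bytes) -> bool:
    """Stub verifier: a real system would run the SNARK pairing checks here."""
    return proof.startswith(b"zkp:") and bool(verification_key)

def image_is_authentic(pixels: bytes, proof_len: int, vk: bytes) -> bool:
    """Extract the embedded proof and verify it against the public key."""
    return verify_snark(read_lsb_payload(pixels, proof_len), vk)
```

The point of the zero-knowledge property is visible in the interface: verification needs only the image and a public verification key, never the model weights or the prompt.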