defense 2025

ZK-WAGON: Imperceptible Watermark for Image Generation Models using ZK-SNARKs

Aadarsh Anantha Ramakrishnan , Shubham Agarwal , Selvanayagam S , Kunwar Singh

0 citations · 18 references · International Conference on AI...


Published on arXiv

2510.01967

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Demonstrates a verifiable, imperceptible watermarking pipeline for both GAN and diffusion models that proves image origin via ZK-SNARKs without leaking any model internals.

ZK-WAGON (SL-ZKCC)

Novel technique introduced


As image generation models grow increasingly powerful and accessible, concerns around the authenticity, ownership, and misuse of synthetic media have become critical. The ability to generate lifelike images that are indistinguishable from real ones introduces risks such as misinformation, deepfakes, and intellectual property violations. Traditional watermarking methods degrade image quality, are easily removed, or require access to confidential model internals, making them unsuitable for secure and scalable deployment. We introduce ZK-WAGON, the first system for watermarking image generation models using Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge (ZK-SNARKs). Our approach enables verifiable proof of origin without exposing model weights, generation prompts, or any sensitive internal information. We propose Selective Layer ZK-Circuit Creation (SL-ZKCC), a method that converts only key layers of an image generation model into a circuit, significantly reducing proof generation time. The generated ZK-SNARK proofs are imperceptibly embedded into the generated image via Least Significant Bit (LSB) steganography. We demonstrate this system on both GAN and diffusion models, providing a secure, model-agnostic pipeline for trustworthy AI image generation.


Key Contributions

  • First system combining ZK-SNARKs with image watermarking, enabling verifiable proof of model origin without exposing model weights, prompts, or internal parameters
  • Selective Layer ZK-Circuit Creation (SL-ZKCC), a method to convert only key model layers into ZK circuits, significantly reducing proof generation overhead
  • End-to-end model-agnostic pipeline that imperceptibly embeds ZK-SNARK proofs into generated images via LSB steganography, demonstrated on both GANs and diffusion models
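
The LSB embedding step named in the contributions can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`embed_lsb`, `extract_lsb`), the 4-byte length header, and the flat byte-sequence view of the image are assumptions made for illustration, and the payload is a placeholder rather than a real ZK-SNARK proof. The sketch only shows why the change is hard to see: each channel value is altered by at most 1.

```python
# Minimal LSB steganography sketch (assumed layout, not the paper's code).
# The image is treated as a flat sequence of 8-bit channel values.

def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Hide payload in the least significant bit of each channel value."""
    # Assumption: a 32-bit big-endian length header precedes the payload
    # so the extractor knows how many bytes to read back.
    data = len(payload).to_bytes(4, "big") + payload
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this cover image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_lsb(pixels: bytes) -> bytes:
    """Recover a payload written by embed_lsb."""

    def read_bytes(start_bit: int, count: int) -> bytes:
        result = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (pixels[start_bit + b * 8 + k] & 1)
            result.append(value)
        return bytes(result)

    if len(pixels) < 32:
        raise ValueError("image too small to contain a length header")
    length = int.from_bytes(read_bytes(0, 4), "big")
    if 32 + length * 8 > len(pixels):
        raise ValueError("no valid payload found")
    return read_bytes(32, length)


if __name__ == "__main__":
    cover = bytes(range(256)) * 4           # stand-in for image channel data
    proof = b"placeholder-proof-bytes"      # hypothetical payload, not a real proof
    stego = embed_lsb(cover, proof)
    assert extract_lsb(stego) == proof
    # Every channel value moves by at most 1 out of 255.
    assert max(abs(a - b) for a, b in zip(cover, stego)) <= 1
```

In this layout the capacity is one payload bit per channel value, so a cover with N channel values can carry at most (N - 32) / 8 payload bytes. A real pipeline would also need a lossless image format, since lossy compression rewrites the low-order bits.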

🛡️ Threat Analysis

Output Integrity Attack

The watermark (a ZK-SNARK proof) is embedded into the image output via steganography, not into the model weights, with the explicit goal of verifying the provenance of AI-generated images. This is content watermarking that traces and authenticates model-generated outputs, directly addressing concerns about synthetic media authenticity and deepfake attribution.


Details

Domains
vision, generative
Model Types
diffusion, gan
Threat Tags
digital, inference_time
Applications
ai-generated image authentication, synthetic media provenance, deepfake attribution