Published on arXiv

2603.23178

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves high perceptual quality and strong robustness against compression, filtering, geometric transformations, and adversarial perturbations across multiple deepfake datasets

SAiW

Novel technique introduced


Deepfakes generated by modern generative models pose a serious threat to information integrity, digital identity, and public trust. Existing detection methods are largely reactive, attempting to identify manipulations after they occur and often failing to generalize across evolving generation techniques. This motivates the need for proactive mechanisms that secure media authenticity at the time of creation. In this work, we introduce SAiW, a Source-Attributed invisible Watermarking framework for proactive deepfake defense and media provenance verification. Unlike conventional watermarking methods that treat watermark payloads as generic signals, SAiW formulates watermark embedding as a source-conditioned representation learning problem, where the watermark identity encodes the originating source and modulates the embedding process to produce discriminative and traceable signatures. The framework integrates feature-wise linear modulation to inject source identity into the embedding network, enabling scalable multi-source watermark generation. A perceptual guidance module derived from human visual system priors ensures that watermark perturbations remain visually imperceptible while maintaining robustness. In addition, a dual-purpose forensic decoder simultaneously reconstructs the embedded watermark and performs source attribution, providing both automated verification and interpretable forensic evidence. Extensive experiments across multiple deepfake datasets demonstrate that SAiW achieves high perceptual quality while maintaining strong robustness against compression, filtering, noise, geometric transformations, and adversarial perturbations. By binding digital media to its origin through invisible yet verifiable markers, SAiW enables reliable authentication and source attribution, providing a scalable foundation for proactive deepfake defense and trustworthy media provenance.


Key Contributions

  • Source-conditioned watermark embedding using feature-wise linear modulation to encode originating source identity
  • Perceptual guidance module ensuring imperceptibility while maintaining robustness against compression, filtering, noise, and adversarial perturbations
  • Dual-purpose forensic decoder for simultaneous watermark reconstruction and source attribution
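The source-conditioned embedding above relies on feature-wise linear modulation (FiLM): a source identity vector is mapped to per-channel scale and shift parameters that modulate the embedder's feature maps, so different sources yield distinct, traceable watermark signatures. The sketch below illustrates the FiLM mechanism only; all function names, shapes, and the RNG stand-in for the learned conditioning network are illustrative assumptions, not the paper's implementation.

```python
# Minimal FiLM sketch: a source identity produces per-channel
# (gamma, beta) pairs that scale and shift the embedder's features.
import random

def film_params(source_id: int, num_channels: int):
    """Map a source identity to per-channel (gamma, beta) parameters.
    In SAiW a learned network would produce these; a seeded RNG
    stands in here so the mapping is deterministic per source."""
    rng = random.Random(source_id)
    gamma = [1.0 + 0.1 * rng.uniform(-1, 1) for _ in range(num_channels)]
    beta = [0.1 * rng.uniform(-1, 1) for _ in range(num_channels)]
    return gamma, beta

def film_modulate(features, gamma, beta):
    """Apply FiLM: every value in channel c is scaled by gamma[c]
    and shifted by beta[c]."""
    return [[g * x + b for x in channel]
            for channel, g, b in zip(features, gamma, beta)]

# Toy feature map: 3 channels x 4 spatial positions.
features = [[0.5, -0.2, 0.1, 0.0] for _ in range(3)]

# Two different source identities modulate the same features differently,
# which is what makes the resulting watermark signatures attributable.
g1, b1 = film_params(source_id=1, num_channels=3)
g2, b2 = film_params(source_id=2, num_channels=3)
out1 = film_modulate(features, g1, b1)
out2 = film_modulate(features, g2, b2)
print(out1 != out2)
```

The key design point is that conditioning happens inside the embedding network (per feature channel) rather than by concatenating an ID to the payload, which is what lets one network serve many sources.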

🛡️ Threat Analysis

Output Integrity Attack

SAiW watermarks model-generated content (deepfakes) to verify provenance and authenticate media outputs. This is content watermarking for output integrity and deepfake detection, not model IP protection.


Details

Domains
vision, generative
Model Types
GAN, diffusion
Threat Tags
inference_time
Applications
deepfake detection, media provenance, content authentication