Defense · 2026

High-Fidelity Face Content Recovery via Tamper-Resilient Versatile Watermarking

Peipeng Yu 1, Jinfeng Xie 1, Chengfu Ou 1, Xiaoyu Zhou 1, Jianwei Fei 2, Yunshu Dai 3, Zhihua Xia 1, Chip Hong Chang 4



Published on arXiv · 2603.23940

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Outperforms strong baselines in watermark robustness, localization accuracy, and recovery quality across realistic deepfake attack scenarios

VeriFi

Novel technique introduced


The proliferation of AIGC-driven face manipulation and deepfakes poses severe threats to media provenance, integrity, and copyright protection. Prior versatile watermarking systems typically rely on embedding explicit localization payloads, which introduces a fidelity-functionality trade-off: larger localization signals degrade visual quality and often reduce decoding robustness under strong generative edits. Moreover, existing methods rarely support content recovery, limiting their forensic value when original evidence must be reconstructed. To address these challenges, we present VeriFi, a versatile watermarking framework that unifies copyright protection, pixel-level manipulation localization, and high-fidelity face content recovery. VeriFi makes three key contributions: (1) it embeds a compact semantic latent watermark that serves as a content-preserving prior, enabling faithful restoration even after severe manipulations; (2) it achieves fine-grained localization without embedding localization-specific artifacts by correlating image features with decoded provenance signals; and (3) it introduces an AIGC attack simulator that combines latent-space mixing with seamless blending to improve robustness to realistic deepfake pipelines. Extensive experiments on CelebA-HQ and FFHQ show that VeriFi consistently outperforms strong baselines in watermark robustness, localization accuracy, and recovery quality, providing a practical and verifiable defense for deepfake forensics.


Key Contributions

  • Unified watermarking framework (VeriFi) supporting copyright protection, manipulation localization, and content recovery without explicit localization payloads
  • Semantic latent watermark enabling high-fidelity face restoration after severe deepfake manipulations
  • AIGC attack simulator combining latent-space mixing and seamless blending to improve robustness against realistic deepfake pipelines
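The attack simulator contribution can be illustrated with a toy sketch. The paper's actual simulator operates inside a generative model's latent space and uses seamless (Poisson-style) blending; the snippet below is a simplified stand-in, assuming linear latent interpolation and a feathered alpha-blend in place of true seamless cloning. All function names and parameters here are hypothetical, not from the paper.

```python
import numpy as np

def latent_mix(z_src, z_ref, alpha=0.5):
    """Hypothetical stand-in for latent-space mixing: linearly
    interpolate a source latent toward a reference latent."""
    return alpha * z_src + (1.0 - alpha) * z_ref

def blend_region(host, donor, mask, feather=5):
    """Blend a donor face region into a host image.
    A feathered (box-averaged) mask approximates seamless blending;
    the real pipeline would use Poisson-style cloning instead."""
    soft = mask.astype(np.float64)
    k = np.ones(feather) / feather  # 1-D box kernel
    # Smooth the binary mask along both axes so region edges fade out.
    for axis in (0, 1):
        soft = np.apply_along_axis(
            lambda m: np.convolve(m, k, mode="same"), axis, soft)
    soft = np.clip(soft, 0.0, 1.0)
    if host.ndim == 3:  # broadcast the mask over color channels
        soft = soft[..., None]
    return (soft * donor + (1.0 - soft) * host).astype(host.dtype)
```

A simulated "deepfake" training sample could then be built by decoding a mixed latent and blending the resulting face crop back into the watermarked host image, giving the decoder realistic tampered inputs to train against.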

🛡️ Threat Analysis

Output Integrity Attack

Proposes a content watermarking system that embeds provenance signals in face images to detect AIGC manipulations, localize tampering, and recover original content. This is output integrity and content authentication — the watermark is embedded IN THE OUTPUT (face images) to verify authenticity and enable forensic reconstruction, not in model weights for IP protection.


Details

Domains
vision, generative
Model Types
diffusion, GAN
Threat Tags
inference_time, digital
Datasets
CelebA-HQ, FFHQ
Applications
deepfake detection, face manipulation forensics, media provenance, content authenticity