GIFGuard: Proactive Forensics against Deepfakes in Facial GIFs via Spatiotemporal Watermarking
Shupeng Che 1, Zhiqing Guo 1, Changtao Miao 2, Dan Ma 1, Gaobo Yang 3
Published on arXiv
2604.26519
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves high-fidelity visual quality with robust watermark extraction even under severe deepfake facial manipulation of GIF sequences
GIFGuard
Novel technique introduced
The rapid evolution of deepfake technology poses an unprecedented threat to the authenticity of Graphics Interchange Format (GIF) imagery, which is representative of short-loop temporal media in social networks. However, existing proactive forensics methods are designed for static images, which limits their applicability to animated GIFs. To bridge this gap, we propose GIFGuard, the first spatiotemporal watermarking framework tailored for proactive deepfake forensics in GIFs. In the embedding stage, we propose the Spatiotemporal Adaptive Residual Encoder (STARE) to ensure robustness against high-level semantic tampering. It employs a 3D convolutional backbone with adaptive channel recalibration to capture globally coherent temporal dependencies. In the extraction stage, we design the Deep Integrity Restoration Decoder (DIRD). It utilizes a spatiotemporal hourglass architecture equipped with 3D attention to restore latent features, allowing accurate extraction of watermark signals even under severe facial manipulation. Furthermore, we construct GIFfaces, the first large-scale benchmark dataset curated for GIF proactive forensics, to facilitate research in this domain. Extensive results show that GIFGuard achieves high-fidelity visual quality and strong robustness against deepfake manipulation. The related code and dataset will be released.
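The abstract's two headline properties, residual embedding and high visual fidelity, can be illustrated with a small sketch. It assumes the watermark is added to the cover frames as a scaled residual (suggested by the "Residual Encoder" name, but not confirmed here) and uses PSNR as the fidelity measure, which is a common choice rather than a metric this summary states. The function names `embed_residual` and `psnr`, the `strength` parameter, and the grayscale nested-list frame layout are all hypothetical.

```python
import math


def embed_residual(cover_frames, residual_frames, strength=1.0):
    """Add a scaled residual to every pixel and clip to the 8-bit range.

    Frames are lists of rows of grayscale values; this layout is an
    assumption made for illustration only.
    """
    marked = []
    for cover, residual in zip(cover_frames, residual_frames):
        marked.append([
            [min(255.0, max(0.0, pc + strength * pr)) for pc, pr in zip(row_c, row_r)]
            for row_c, row_r in zip(cover, residual)
        ])
    return marked


def psnr(cover_frames, marked_frames, peak=255.0):
    """Peak signal-to-noise ratio in dB, pooled over all pixels of all frames."""
    total_sq_err = 0.0
    count = 0
    for cover, marked in zip(cover_frames, marked_frames):
        for row_c, row_m in zip(cover, marked):
            for pc, pm in zip(row_c, row_m):
                total_sq_err += (pc - pm) ** 2
                count += 1
    mse = total_sq_err / count
    if mse == 0:
        return math.inf  # identical sequences
    return 10.0 * math.log10(peak ** 2 / mse)


# Two 2x2 frames of constant intensity and a unit residual on every pixel.
cover = [[[100.0, 100.0], [100.0, 100.0]] for _ in range(2)]
residual = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
marked = embed_residual(cover, residual, strength=1.0)
quality_db = psnr(cover, marked)
```

A smaller `strength` raises PSNR but leaves less signal for the decoder to recover, which is the fidelity versus robustness trade-off a learned encoder would have to balance.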
Key Contributions
- First spatiotemporal watermarking framework specifically designed for proactive forensics in animated GIFs against deepfakes
- STARE encoder using 3D convolutions with adaptive channel recalibration to capture temporal dependencies and keep the watermark temporally consistent across frames
- DIRD decoder with spatiotemporal hourglass architecture and 3D attention for extracting watermarks under severe facial manipulation
- GIFfaces benchmark dataset for GIF proactive forensics research
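The "adaptive channel recalibration" named in the STARE contribution can be sketched as a squeeze-and-excitation style gate over a spatiotemporal feature tensor. This is an assumption about what the term means: the summary does not give the actual STARE design, and the single linear layer below is a simplification of the usual two-layer bottleneck. The `features[c][t][h][w]` layout and the names `squeeze`, `excite`, and `recalibrate` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def squeeze(features):
    """Average each channel over time, height, and width.

    features is indexed as features[c][t][h][w] (assumed layout).
    """
    descriptor = []
    for channel in features:
        values = [v for frame in channel for row in frame for v in row]
        descriptor.append(sum(values) / len(values))
    return descriptor


def excite(descriptor, weights, biases):
    """One linear layer plus sigmoid, giving a gate in (0, 1) per channel."""
    return [
        sigmoid(sum(w * d for w, d in zip(row, descriptor)) + b)
        for row, b in zip(weights, biases)
    ]


def recalibrate(features, weights, biases):
    """Scale every value of a channel by that channel's gate."""
    gates = excite(squeeze(features), weights, biases)
    return [
        [[[v * g for v in row] for row in frame] for frame in channel]
        for channel, g in zip(features, gates)
    ]


# Two channels, two frames, each frame 1x2.
features = [
    [[[2.0, 2.0]], [[2.0, 2.0]]],
    [[[4.0, 4.0]], [[4.0, 4.0]]],
]
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
zero_biases = [0.0, 0.0]
recalibrated = recalibrate(features, zero_weights, zero_biases)
```

Because the channel descriptor pools over the time axis as well as the spatial axes, each gate depends on the whole clip rather than a single frame, which is one plausible way to encourage temporally consistent embedding.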
🛡️ Threat Analysis
Embeds watermarks in GIF content outputs to verify authenticity and provenance, enabling detection of deepfake manipulation. This is content watermarking (output integrity), not model watermarking. The watermark is embedded in the media content itself to trace whether it has been tampered with by deepfakes.
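As a rough illustration of how such a content watermark might be checked downstream, the sketch below compares the bits recovered from a possibly manipulated GIF with the bits originally embedded. The bit-accuracy measure is a standard way to score watermark recovery, but its use here, the `threshold` value, and the function names are assumptions; the summary does not describe GIFGuard's actual verification rule.

```python
def bit_accuracy(embedded, extracted):
    """Fraction of watermark bits recovered correctly."""
    if len(embedded) != len(extracted):
        raise ValueError("bit strings must have the same length")
    if not embedded:
        raise ValueError("bit strings must not be empty")
    matches = sum(1 for a, b in zip(embedded, extracted) if a == b)
    return matches / len(embedded)


def is_provenance_verified(embedded, extracted, threshold=0.9):
    """Accept the GIF as carrying the expected watermark.

    The 0.9 default is a hypothetical operating point, not a value
    taken from the paper.
    """
    return bit_accuracy(embedded, extracted) >= threshold


embedded_bits = [1, 0, 1, 1, 0, 0, 1, 0]
extracted_bits = [1, 0, 1, 1, 0, 1, 1, 0]  # one bit flipped by manipulation
accuracy = bit_accuracy(embedded_bits, extracted_bits)
```

In practice the threshold would be chosen from measured accuracy on clean and manipulated samples, so that random or absent watermarks are rejected while watermarks degraded by face manipulation are still accepted.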