defense 2025

WaTeRFlow: Watermark Temporal Robustness via Flow Consistency

Utae Jeong 1, Sumin In 1, Hyunju Ryu 1, Jaewan Choi 1, Feng Yang 2, Jongheon Jeong 1, Seungryong Kim 3, Sangpil Kim 1

0 citations · 90 references · arXiv


Published on arXiv: 2512.19048

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

WaTeRFlow achieves higher first-frame and per-frame watermark bit accuracy across representative I2V models with resilience to distortions applied before or after video generation.
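The two metrics named in the key finding can be illustrated with a minimal sketch. It assumes the decoder emits one list of 0/1 bits per frame and that "per-frame" accuracy is the mean over all frames; both are assumptions made for illustration, and the function names `bit_accuracy` and `video_bit_accuracy` are hypothetical rather than taken from the paper.

```python
def bit_accuracy(decoded, message):
    """Fraction of decoded bits that match the embedded message."""
    if len(decoded) != len(message):
        raise ValueError("decoded bits and message must have the same length")
    return sum(d == m for d, m in zip(decoded, message)) / len(message)


def video_bit_accuracy(per_frame_decoded, message):
    """First-frame accuracy and mean accuracy over all frames (assumed definition)."""
    scores = [bit_accuracy(bits, message) for bits in per_frame_decoded]
    return {
        "first_frame": scores[0],
        "per_frame_mean": sum(scores) / len(scores),
    }


# Hypothetical 4-bit message decoded from three generated frames.
message = [1, 0, 1, 1]
frames = [
    [1, 0, 1, 1],  # all four bits correct
    [1, 0, 0, 1],  # one bit flipped
    [0, 0, 1, 1],  # one bit flipped
]
result = video_bit_accuracy(frames, message)
```

Reporting the first frame separately is useful because it is the frame closest to the watermarked conditioning image, while later frames drift further from it.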

Novel technique introduced: WaTeRFlow


Image watermarking supports authenticity and provenance, yet many schemes are still easy to bypass with common distortions and powerful generative edits. Deep learning-based watermarking has improved robustness to diffusion-based image editing, but a gap remains when a watermarked image is converted to video by an image-to-video (I2V) model: per-frame watermark detection weakens in the generated frames. I2V has quickly advanced from short, jittery clips to multi-second, temporally coherent scenes, and it now serves not only content creation but also world-modeling and simulation workflows, which makes cross-modal watermark recovery crucial. We present WaTeRFlow, a framework tailored for robustness under I2V. It consists of (i) FUSE (Flow-guided Unified Synthesis Engine), which exposes the encoder-decoder to realistic distortions via instruction-driven edits and a fast video diffusion proxy during training, (ii) optical-flow warping with a Temporal Consistency Loss (TCL) that stabilizes per-frame predictions, and (iii) a semantic preservation loss that maintains the conditioning signal. Experiments across representative I2V models show accurate watermark recovery from frames, with higher first-frame and per-frame bit accuracy and resilience when various distortions are applied before or after video generation.


Key Contributions

  • FUSE (Flow-guided Unified Synthesis Engine): exposes the watermark encoder-decoder to realistic I2V distortions via instruction-driven edits and a fast video diffusion proxy during training to improve cross-modal robustness
  • Optical-flow warping combined with a Temporal Consistency Loss (TCL) that stabilizes per-frame watermark predictions across video frames
  • Semantic preservation loss that maintains the integrity of the conditioning signal during watermark training
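The idea behind flow-warped temporal consistency can be sketched with a toy example. This is not the paper's TCL definition: it assumes a per-pixel prediction map for each frame, integer-valued flow given as `(dx, dy)` offsets, nearest-neighbor backward warping with border clamping, and a mean squared penalty. The names `warp_nearest` and `consistency_penalty` are hypothetical.

```python
def warp_nearest(frame, flow):
    """Backward-warp a 2D grid: out[y][x] = frame[y + dy][x + dx], clamped to the border."""
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = flow[y][x]
            sy = min(max(y + round(dy), 0), h - 1)
            sx = min(max(x + round(dx), 0), w - 1)
            row.append(frame[sy][sx])
        out.append(row)
    return out


def consistency_penalty(pred_prev, pred_next, flow):
    """Mean squared difference between the next-frame map and the warped previous map."""
    warped = warp_nearest(pred_prev, flow)
    total, count = 0.0, 0
    for row_w, row_n in zip(warped, pred_next):
        for a, b in zip(row_w, row_n):
            total += (a - b) ** 2
            count += 1
    return total / count


# Toy 2x2 maps: every output pixel samples its right-hand neighbor in the previous map.
pred_prev = [[1.0, 2.0], [3.0, 4.0]]
flow = [[(1, 0), (1, 0)], [(1, 0), (1, 0)]]
pred_next = [[2.0, 2.0], [4.0, 4.0]]
penalty = consistency_penalty(pred_prev, pred_next, flow)
```

The penalty is zero when the next frame's predictions agree with the motion-compensated previous ones, so minimizing it discourages frame-to-frame flicker in the decoded signal. A real training setup would need a differentiable warp (for example bilinear sampling) rather than the nearest-neighbor lookup used here.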

🛡️ Threat Analysis

Output Integrity Attack

WaTeRFlow is a content watermarking framework that embeds marks in images and recovers them from I2V-generated video frames to verify provenance and copyright, which falls squarely under output integrity and content provenance. The watermark lives in the image content (the output), not in model weights, so this is ML09 rather than ML05.


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
inference_time, digital
Applications
image copyright protection, video content provenance, cross-modal watermark recovery