SPDMark: Selective Parameter Displacement for Robust Video Watermarking

The advent of high-quality video generation models has amplified the need for robust watermarking schemes that can be used to reliably detect and track the provenance of generated videos. Existing video watermarking methods, whether post-hoc or in-generation, fail to simultaneously achieve imperceptibility, robustness, and computational efficiency. This work introduces a novel framework for in-generation video watermarking called SPDMark (pronounced 'SpeedMark') based on selective parameter displacement of a video diffusion model. Watermarks are embedded into the generated videos by modifying a subset of parameters in the generative model. To make the problem tractable, the displacement is modeled as an additive composition of layer-wise basis shifts, where the final composition is indexed by the watermarking key. For parameter efficiency, this work specifically leverages low-rank adaptation (LoRA) to implement the basis shifts. During the training phase, the basis shifts and the watermark extractor are jointly learned by minimizing a combination of message recovery, perceptual similarity, and temporal consistency losses. To detect and localize temporal modifications in the watermarked videos, a cryptographic hash function is used to derive frame-specific watermark messages from the given base watermarking key. During watermark extraction, maximum bipartite matching is applied to recover the correct frame order, even from temporally tampered videos. Evaluations on both text-to-video and image-to-video generation models demonstrate the ability of SPDMark to generate imperceptible watermarks that can be recovered with high accuracy and also establish its robustness against a variety of common video modifications.
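The key-indexed additive composition described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration, not the paper's stated design: the key is treated as a bit vector, each layer holds one hypothetical LoRA pair (B_k, A_k) per key bit, and the displaced weight is taken to be W + sum_k key[k] * B_k A_k. The basis shifts are random here, whereas in SPDMark they are learned jointly with the extractor.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def random_matrix(rows, cols, rng, scale=0.01):
    """Gaussian random matrix; stands in for learned parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def make_basis_shifts(num_bits, d_out, d_in, rank, seed=0):
    """Hypothetical bank of low-rank pairs (B_k, A_k), one per key bit."""
    rng = random.Random(seed)
    return [
        (random_matrix(d_out, rank, rng), random_matrix(rank, d_in, rng))
        for _ in range(num_bits)
    ]

def displaced_weight(weight, basis_shifts, key_bits):
    """Return W + sum_k key_bits[k] * (B_k @ A_k), an assumed composition rule."""
    out = [row[:] for row in weight]
    for bit, (b, a) in zip(key_bits, basis_shifts):
        if bit:
            out = add(out, matmul(b, a))
    return out

# Example: one 4x6 layer, 8 key bits, rank-2 basis shifts.
rng = random.Random(1)
W = random_matrix(4, 6, rng, scale=1.0)
basis = make_basis_shifts(num_bits=8, d_out=4, d_in=6, rank=2)
key = [1, 0, 1, 1, 0, 0, 1, 0]
W_marked = displaced_weight(W, basis, key)
```

Under this assumed rule, an all-zero key leaves the layer unchanged, and each key selects a different sum of low-rank shifts while only the small factor matrices need to be stored.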

Key Contributions

Selective parameter displacement via additive LoRA-based basis shifts indexed by a watermarking key for imperceptible in-generation video watermarking
Cryptographic hashing to derive frame-specific watermark messages, enabling temporal localization and detection of frame-level modifications
Maximum bipartite matching during extraction to recover correct frame order even from temporally tampered videos
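The last two contributions can be sketched together. This is a minimal illustration under stated assumptions, not the authors' implementation: SHA-256 over the base key concatenated with a 4-byte frame index is an assumed derivation, the 32-bit message length and the `max_errors` tolerance are hypothetical parameters, and the matching is done with Kuhn's augmenting-path algorithm on a graph that links an extracted message to every frame index whose expected message is within the tolerance.

```python
import hashlib

def frame_message(base_key: bytes, frame_index: int, num_bits: int = 32) -> list[int]:
    """Derive a frame-specific bit message from SHA-256(base_key || frame_index)."""
    digest = hashlib.sha256(base_key + frame_index.to_bytes(4, "big")).digest()
    bits = [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]
    return bits[:num_bits]

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def recover_frame_order(extracted, base_key, num_frames, max_errors=2):
    """Match each extracted message to an original frame index.

    Returns a list whose i-th entry is the original index matched to
    extracted frame i, or None if that frame could not be matched.
    """
    num_bits = len(extracted[0]) if extracted else 0
    expected = [frame_message(base_key, j, num_bits) for j in range(num_frames)]
    # Candidate edges: extracted frame i -> indices with a close enough message.
    edges = [
        [j for j in range(num_frames) if hamming(msg, expected[j]) <= max_errors]
        for msg in extracted
    ]
    owner = [None] * num_frames  # owner[j] = extracted frame matched to index j

    def try_assign(i, seen):
        """Kuhn's algorithm: look for an augmenting path starting at frame i."""
        for j in edges[i]:
            if j in seen:
                continue
            seen.add(j)
            if owner[j] is None or try_assign(owner[j], seen):
                owner[j] = i
                return True
        return False

    for i in range(len(extracted)):
        try_assign(i, set())

    order = [None] * len(extracted)
    for j, i in enumerate(owner):
        if i is not None:
            order[i] = j
    return order

# Example: six frames are shuffled and one extracted bit is corrupted.
key = b"example-base-key"  # hypothetical key
original = [frame_message(key, j) for j in range(6)]
shuffled = [list(original[j]) for j in (3, 0, 5, 1, 4, 2)]
shuffled[2][0] ^= 1  # simulate a single extraction error
order = recover_frame_order(shuffled, key, num_frames=6)
```

In this sketch, a frame whose extracted message is far from every expected message ends up unmatched (`None`), which is one plausible way to flag inserted or heavily tampered frames. A weighted assignment over Hamming distances would be an alternative design when extraction errors are frequent.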

🛡️ Threat Analysis

Output Integrity Attack

Embeds watermarks INTO the generated video outputs (not into model weights for ownership proof) by perturbing model parameters during generation. The goal is content provenance tracking and AI-generated video detection — classic output integrity / content watermarking covered by ML09. Parameter modification is the embedding mechanism, not the watermarked asset itself.

Details

Domains

vision, generative

Model Types

diffusion

Threat Tags

training_time, inference_time

Applications

Similar Papers

PoseGuard: Pose-Guided Generation with Safety Guardrails

Robust Concept Erasure in Diffusion Models: A Theoretical Perspective on Security and Robustness

VideoEraser: Concept Erasure in Text-to-Video Diffusion Models

EIRES: Training-free AI-Generated Image Detection via Edit-Induced Reconstruction Error Shift

Neighbor-Aware Localized Concept Erasure in Text-to-Image Diffusion Models

I2VWM: Robust Watermarking for Image to Video Generation

ALIEN: Analytic Latent Watermarking for Controllable Generation

A Difference-in-Difference Approach to Detecting AI-Generated Images