defense 2025

Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing

Rohit Chowdhury , Aniruddha Bala , Rohan Jaiswal , Siddharth Roheda

0 citations · 18 references · arXiv

Published on arXiv

2509.23279

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Vid-Freeze achieves near-complete temporal freezing of I2V-generated videos with perturbation budgets as small as 2 pixels, surpassing I2VGuard which leaves residual motion dynamics.

Vid-Freeze

Novel technique introduced


The rapid progress of image-to-video (I2V) generation models has introduced significant risks, enabling video synthesis from static images and facilitating deceptive or malicious content creation. While prior defenses such as I2VGuard attempt to immunize images, effective and principled protection to block motion remains underexplored. In this work, we introduce Vid-Freeze, a novel attention-suppressing adversarial attack that adds carefully crafted adversarial perturbations to images. Our method explicitly targets the attention mechanism of I2V models, completely disrupting motion synthesis while preserving the semantic fidelity of the input image. The resulting immunized images generate stand-still or near-static videos, effectively blocking malicious content creation. Our experiments demonstrate the impressive protection provided by the proposed approach, highlighting the importance of attention attacks as a promising direction for robust and proactive defenses against misuse of I2V generation models.
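
As a rough illustration of what an attention-suppressing perturbation could look like as an optimization problem, one generic way to write it is shown below. This is a sketch, not the paper's stated objective: the choice of the $\ell_\infty$ norm, the squared Frobenius penalty, and the symbols $A_l$ (attention output of layer $l$), $S$ (the set of selected layers), and $\epsilon$ (the perturbation budget) are all assumptions introduced here for illustration.

```latex
\[
\delta^{\star} \;=\; \arg\min_{\|\delta\|_{\infty} \le \epsilon}
\;\sum_{l \in S} \bigl\| A_l(x + \delta) \bigr\|_F^{2},
\qquad
x_{\text{protected}} = x + \delta^{\star}
\]
```

Under this reading, $x$ is the clean input image, and driving the attention outputs of the selected layers toward zero is what would starve the model of the signal it uses to synthesize motion.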


Key Contributions

  • Principled layer-selection strategy identifying the most vulnerable attention layers in I2V diffusion transformers (CogVideoX, SVD)
  • Attention suppression loss that collapses generated video onto the static input frame, achieving complete temporal freezing
  • Demonstration of temporal freezing with minimal perturbation budgets, sometimes modifying as few as 2 pixels, outperforming the prior I2VGuard defense
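
The contributions above describe minimizing an attention-related loss under a small perturbation budget. The following is a minimal sketch of that general pattern using projected gradient descent, not the authors' implementation. Everything named here is hypothetical: a random linear map `W` stands in for the selected attention layers of a real I2V model, `surrogate_loss` stands in for the attention suppression loss, and the L-infinity budget `EPS` is an assumed form of the constraint, since the summary does not state which norm is used.

```python
import numpy as np

EPS = 2 / 255       # assumed L-infinity budget on a [0, 1] intensity scale
STEP = 0.5 / 255    # assumed step size for each signed-gradient update


def surrogate_loss(x, W):
    """Squared magnitude of the toy layer output (stand-in for attention energy)."""
    return float(np.sum((W @ x) ** 2))


def pgd_suppress(x, W, eps=EPS, step=STEP, iters=50):
    """Projected gradient descent that shrinks ||W @ x_adv||^2 within an L-inf ball."""
    x_adv = x.copy()
    for _ in range(iters):
        grad = 2.0 * W.T @ (W @ x_adv)             # analytic gradient of the toy loss
        x_adv = x_adv - step * np.sign(grad)       # signed descent step
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back into the budget
        x_adv = np.clip(x_adv, 0.0, 1.0)           # keep a valid pixel range
    return x_adv


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=64)   # flattened toy "image"
W = rng.normal(size=(16, 64))        # hypothetical stand-in for an attention layer
x_adv = pgd_suppress(x, W)

print("clean loss:    ", surrogate_loss(x, W))
print("protected loss:", surrogate_loss(x_adv, W))
print("max change:    ", float(np.max(np.abs(x_adv - x))))
```

In a real setting the gradient would come from backpropagating through the I2V model's attention layers rather than from a closed-form expression, and the layer-selection step would decide which layers contribute to the loss.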

🛡️ Threat Analysis

Output Integrity Attack

Protects image content integrity from malicious AI-generated video creation — a proactive defense that prevents I2V models from animating protected images. Parallels the class of protective adversarial perturbations (PhotoGuard, Mist, Glaze) whose removal would itself be an ML09 output-integrity attack. The goal is preventing malicious AI-generated content, making this fundamentally a content protection/integrity contribution.


Details

Domains
vision, generative
Model Types
diffusion, transformer
Threat Tags
white_box, inference_time
Datasets
CogVideoX, Stable Video Diffusion
Applications
image-to-video generation, video synthesis, deepfake prevention