benchmark 2025

RobustSora: De-Watermarked Benchmark for Robust AI-Generated Video Detection

Zhuo Wang, Xiliang Liu, Ligang Sun

1 citation · 28 references · arXiv

Published on arXiv: 2512.10248

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Watermark manipulation causes 2-8pp performance variations across ten AIGC video detectors, with transformer-based models showing consistent 6-8pp watermark dependency and MLLMs exhibiting diverse patterns, indicating partial but non-negligible reliance on watermark artifacts.

RobustSora

Novel technique introduced


The proliferation of AI-generated video technologies poses challenges to information integrity. While recent benchmarks advance AIGC video detection, they overlook a critical factor: many state-of-the-art generative models embed digital watermarks in their outputs, and detectors may partially rely on these patterns. To evaluate this influence, we present RobustSora, a benchmark designed to assess watermark robustness in AIGC video detection. We systematically construct a dataset of 6,500 videos comprising four types: Authentic-Clean (A-C), Authentic-Spoofed with fake watermarks (A-S), Generated-Watermarked (G-W), and Generated-DeWatermarked (G-DeW). The benchmark introduces two evaluation tasks: Task-I tests performance on watermark-removed AI videos, while Task-II assesses false-alarm rates on authentic videos carrying fake watermarks. Experiments with ten models spanning specialized AIGC detectors, transformer architectures, and MLLM approaches reveal performance variations of 2-8pp under watermark manipulation. Transformer-based models show consistent moderate dependency (6-8pp), while MLLMs exhibit diverse patterns (2-8pp). These findings indicate partial watermark dependency and highlight the need for watermark-aware training strategies. RobustSora provides essential tools for advancing robust AIGC detection research.


Key Contributions

  • RobustSora dataset of 6,500 videos across four types (Authentic-Clean, Authentic-Spoofed, Generated-Watermarked, Generated-DeWatermarked) for evaluating watermark robustness in AIGC video detection
  • Two novel evaluation tasks: Task-I (watermark erasure robustness on de-watermarked AI videos) and Task-II (false alarm rate on authentic videos with fake watermarks)
  • Comprehensive experiments across ten detectors revealing 2-8pp performance variation due to watermark manipulation, with transformer models showing consistent 6-8pp dependency
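The two evaluation tasks above reduce to simple gap metrics over the four video types. The sketch below shows one plausible way to compute them; the detector interface, the dict-based "video" objects, and the function names are illustrative assumptions, not the paper's actual API.

```python
# Hedged sketch of RobustSora's two evaluation tasks.
# A "detector" here is any callable mapping a video (an opaque object)
# to True if it predicts the video is AI-generated. This interface is
# an assumption for illustration, not the benchmark's real API.
from typing import Callable, List

Detector = Callable[[object], bool]

def accuracy(detector: Detector, videos: List[object], is_generated: bool) -> float:
    """Fraction of videos the detector labels correctly, given the
    ground-truth class (generated or authentic) of the whole list."""
    correct = sum(1 for v in videos if detector(v) == is_generated)
    return correct / len(videos)

def task_i_gap_pp(detector: Detector,
                  g_w: List[object],
                  g_dew: List[object]) -> float:
    """Task-I: watermark-erasure robustness, in percentage points (pp).
    Accuracy drop from Generated-Watermarked to Generated-DeWatermarked;
    a large positive gap suggests reliance on watermark artifacts."""
    return 100.0 * (accuracy(detector, g_w, True)
                    - accuracy(detector, g_dew, True))

def task_ii_false_alarm_pp(detector: Detector,
                           a_c: List[object],
                           a_s: List[object]) -> float:
    """Task-II: rise in false-alarm rate when authentic videos carry
    spoofed watermarks (Authentic-Spoofed vs Authentic-Clean)."""
    fa_clean = 1.0 - accuracy(detector, a_c, False)
    fa_spoofed = 1.0 - accuracy(detector, a_s, False)
    return 100.0 * (fa_spoofed - fa_clean)
```

As a sanity check, a degenerate detector that keys entirely on a watermark flag would show a 100pp Task-I gap and a 100pp Task-II false-alarm rise, while a watermark-agnostic detector would score 0pp on both; the paper's reported 2-8pp variations sit between these extremes.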

🛡️ Threat Analysis

Output Integrity Attack

The benchmark directly evaluates AI-generated content (video) detection systems — core ML09 territory — and specifically probes how watermark removal attacks (de-watermarking) and watermark spoofing attacks degrade or distort detection performance, addressing output integrity and content provenance authenticity.


Details

Domains
vision
Model Types
transformer, vlm
Threat Tags
inference_time, digital
Datasets
RobustSora, Vript, DVF, UltraVideo
Applications
ai-generated video detection, deepfake video detection