defense 2026

mAVE: A Watermark for Joint Audio-Visual Generation Models

Luyang Si, Leyi Pan, Lijie Wen

0 citations


Published on arXiv

2603.07090

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

mAVE achieves >99% binding integrity against Swap Attacks on joint audio-visual generative models while maintaining original generation quality.

mAVE (Manifold Audio-Visual Entanglement)

Novel technique introduced


As Joint Audio-Visual Generation Models see widespread commercial deployment, embedding watermarks has become essential for protecting vendor copyright and ensuring content provenance. However, existing techniques suffer from an architectural mismatch by treating modalities as decoupled entities, exposing a critical Binding Vulnerability. Adversaries exploit this via Swap Attacks by replacing authentic audio with malicious deepfakes while retaining the watermarked video. Because current detectors rely on independent verification ($Video_{wm}\vee Audio_{wm}$), they incorrectly authenticate the manipulated content, falsely attributing harmful media to the original vendor and severely damaging their reputation. To address this, we propose mAVE (Manifold Audio-Visual Entanglement), the first watermarking framework natively designed for joint architectures. mAVE cryptographically binds audio and video latents at initialization without fine-tuning, defining a Legitimate Entanglement Manifold via Inverse Transform Sampling. Experiments on state-of-the-art models (LTX-2, MOVA) demonstrate that mAVE guarantees performance-losslessness and provides an exponential security bound against Swap Attacks. Achieving near-perfect binding integrity ($>99\%$), mAVE offers a robust cryptographic defense for vendor copyright.
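The decision logic behind the Binding Vulnerability can be sketched in a few lines. This is a minimal illustration, not the paper's detector: the `DetectionResult` type, the session IDs, and the `bound_verify` rule are hypothetical names introduced here to contrast the OR rule from the abstract with a check that requires both modalities to agree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DetectionResult:
    """Hypothetical detector output: the session ID decoded from each
    modality's watermark, or None when no watermark is found."""
    video_session: Optional[str]
    audio_session: Optional[str]


def independent_verify(result: DetectionResult) -> bool:
    # OR rule from the abstract: one watermarked modality is enough.
    return result.video_session is not None or result.audio_session is not None


def bound_verify(result: DetectionResult) -> bool:
    # Illustrative binding rule (an assumption, not mAVE's actual test):
    # both modalities must carry a watermark from the same session.
    return (
        result.video_session is not None
        and result.video_session == result.audio_session
    )


# Authentic clip: both modalities come from one generation session.
authentic = DetectionResult(video_session="session-42", audio_session="session-42")
# Swap Attack: watermarked video kept, audio replaced by an unmarked deepfake.
swapped = DetectionResult(video_session="session-42", audio_session=None)

print(independent_verify(swapped))  # True: the swapped clip is authenticated
print(bound_verify(swapped))        # False: the swap is rejected
print(bound_verify(authentic))      # True: the authentic clip still passes
```

Under the OR rule the video watermark alone carries the swapped clip through verification, which is the false attribution the abstract describes.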


Key Contributions

  • mAVE: first watermarking framework natively designed for joint audio-visual generation, cryptographically binding audio and video latents at initialization via Inverse Transform Sampling without fine-tuning
  • Theoretical guarantees of performance-losslessness (indistinguishability from standard Gaussian sampling) and an exponential security bound against Swap Attacks via Hoeffding's inequality
  • Experiments on LTX-2 and MOVA achieving >99% binding integrity, outperforming naive combinations of unimodal watermarks
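The first two contributions can be made concrete with a generic sketch of distribution-preserving Inverse Transform Sampling and a Hoeffding-style bound. This is not the paper's construction: the HMAC-based bit derivation, the chaining of audio bits to video bits, the key, and every function name below are assumptions made for illustration, and the detector is idealized as noise-free (real latent inversion is noisy, so an exact chained hash would need an error-tolerant replacement).

```python
import hashlib
import hmac
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian N(0, 1)


def keyed_bits(key, label, n):
    """Derive n pseudo-random bits from HMAC-SHA256 in counter mode."""
    bits, counter = [], 0
    while len(bits) < n:
        msg = label + counter.to_bytes(4, "big")
        for byte in hmac.new(key, msg, hashlib.sha256).digest():
            bits.extend((byte >> shift) & 1 for shift in range(8))
        counter += 1
    return bits[:n]


def sample_latent(bits, rng):
    """Inverse Transform Sampling: bit b selects the half [b/2, (b+1)/2) of
    the unit interval, a uniform draw picks a point inside it, and the
    inverse normal CDF maps that point to a latent value. With uniform bits
    the point is uniform on [0, 1), so each value is N(0, 1) distributed."""
    latent = []
    for b in bits:
        p = (b + rng.random()) / 2.0
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # inv_cdf needs 0 < p < 1
        latent.append(STD_NORMAL.inv_cdf(p))
    return latent


def recover_bits(latent):
    # The sign of each latent value reveals which half was selected.
    return [1 if z >= 0 else 0 for z in latent]


def binding_score(key, video_latent, audio_latent):
    """Fraction of audio bits matching the pattern implied by the video."""
    video_bits = recover_bits(video_latent)
    expected = keyed_bits(key, b"audio" + bytes(video_bits), len(audio_latent))
    observed = recover_bits(audio_latent)
    return sum(e == o for e, o in zip(expected, observed)) / len(expected)


def hoeffding_bound(n, tau):
    """Hoeffding upper bound on P(match fraction >= tau) when the n audio
    bits are independent fair coins, for a threshold tau > 1/2."""
    return math.exp(-2.0 * n * (tau - 0.5) ** 2)


rng = random.Random(0)
key = b"vendor-secret"  # hypothetical vendor key
n = 256                 # hypothetical number of watermarked coordinates

video_bits = keyed_bits(key, b"video", n)
audio_bits = keyed_bits(key, b"audio" + bytes(video_bits), n)  # bound to video
video_latent = sample_latent(video_bits, rng)
audio_latent = sample_latent(audio_bits, rng)

genuine_score = binding_score(key, video_latent, audio_latent)
swapped_latent = [rng.gauss(0.0, 1.0) for _ in range(n)]  # unrelated audio
swapped_score = binding_score(key, video_latent, swapped_latent)
bound = hoeffding_bound(n, 0.75)

print(genuine_score, swapped_score, bound)
```

In this toy setting the genuine pair matches on every bit, while unrelated audio matches each bit with probability 1/2. The chance that it clears a threshold `tau` is therefore at most `exp(-2 n (tau - 1/2)^2)`, which shrinks exponentially in the number of coordinates `n`.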

🛡️ Threat Analysis

Output Integrity Attack

mAVE watermarks the OUTPUTS (generated audio and video content) of joint generative models to ensure content provenance and authenticate that both modalities originated from the same generation session. It defends against Swap Attacks — a content manipulation where adversaries replace authentic audio with a deepfake while retaining watermarked video to bypass independent detectors. This is classic output integrity / content provenance watermarking, not model-weight watermarking (ML05).


Details

Domains
multimodal, audio, vision, generative
Model Types
diffusion, multimodal
Threat Tags
inference_time, digital
Datasets
LTX-2, MOVA
Applications
ai-generated video, ai-generated audio, multimedia content provenance, vendor copyright protection