AWARE: Audio Watermarking with Adversarial Resistance to Edits

Prevailing practice in learning-based audio watermarking is to pursue robustness by expanding the set of simulated distortions during training. However, such surrogates are narrow and prone to overfitting. This paper presents AWARE (Audio Watermarking with Adversarial Resistance to Edits), an alternative approach that avoids reliance on attack-simulation stacks and handcrafted differentiable distortions. Embedding is obtained via adversarial optimization in the time-frequency domain under a level-proportional perceptual budget. Detection employs a time-order-agnostic detector with a Bitwise Readout Head (BRH) that aggregates temporal evidence into one score per watermark bit, enabling reliable watermark decoding even under desynchronization and temporal cuts. Empirically, AWARE attains high audio quality and speech intelligibility (PESQ/STOI) and consistently low BER across various audio edits, often surpassing representative state-of-the-art learning-based audio watermarking systems.
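The summary above does not define the "level-proportional perceptual budget" precisely. One plausible reading is that the allowed change in each time-frequency bin scales with the host signal's magnitude in that bin. The sketch below illustrates that reading only, as a projection step one might apply after each adversarial update. The function name `project_to_level_budget`, the ratio `eps`, and the nested-list spectrogram layout are assumptions made for illustration, not details taken from the paper.

```python
def project_to_level_budget(spec_mag, delta, eps=0.05):
    """Clip a perturbation so that |delta[t][f]| <= eps * |spec_mag[t][f]|.

    spec_mag: host magnitudes, indexed [frame][frequency bin] (assumed layout).
    delta:    candidate watermark perturbation with the same shape.
    eps:      hypothetical budget ratio; louder bins may change more.
    """
    projected = []
    for mag_row, delta_row in zip(spec_mag, delta):
        row = []
        for mag, d in zip(mag_row, delta_row):
            bound = eps * abs(mag)  # budget is proportional to the local level
            row.append(max(-bound, min(bound, d)))
        projected.append(row)
    return projected


# Toy 2x2 "spectrogram": a silent bin (0.0) admits no perturbation at all.
host = [[1.0, 0.1], [10.0, 0.0]]
update = [[0.2, -0.2], [0.3, 0.5]]
clipped = project_to_level_budget(host, update, eps=0.05)
```

Under this reading, the update of 0.3 in the loud bin (magnitude 10.0) passes through unchanged because its bound is 0.5, while the same-sized updates in quiet bins are clipped hard. That is the intuition behind tying the budget to signal level rather than using a single global bound.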

Key Contributions

Adversarial watermark embedding in the time-frequency domain under a level-proportional perceptual budget, avoiding reliance on handcrafted differentiable distortion surrogates
Time-order-agnostic detector with a Bitwise Readout Head (BRH) that aggregates temporal evidence per bit, enabling robust decoding under desynchronization and temporal cuts
Empirical demonstration of consistently low BER across diverse audio edits while maintaining high PESQ/STOI quality scores, often surpassing representative state-of-the-art learning-based audio watermarking baselines
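To make the second and third contributions concrete, the sketch below shows why pooling per-frame evidence into one score per bit is insensitive to frame order and tolerant of temporal cuts, and how bit error rate (BER) is computed. Mean pooling is used here as a stand-in, since the summary does not specify the BRH's actual aggregation. The per-frame scores, the zero threshold, and all function names are hypothetical.

```python
def pool_bit_scores(frame_scores):
    """Average per-frame scores into one score per bit.

    frame_scores: list of frames, each a list with one score per watermark bit.
    A mean over frames does not depend on frame order (up to rounding).
    """
    n_frames = len(frame_scores)
    n_bits = len(frame_scores[0])
    return [sum(frame[b] for frame in frame_scores) / n_frames
            for b in range(n_bits)]


def decode_bits(frame_scores, threshold=0.0):
    """Threshold each pooled score to obtain a hard bit decision."""
    return [1 if s > threshold else 0 for s in pool_bit_scores(frame_scores)]


def bit_error_rate(decoded, reference):
    """Fraction of decoded bits that differ from the embedded message."""
    if len(decoded) != len(reference):
        raise ValueError("bit strings must have equal length")
    return sum(d != r for d, r in zip(decoded, reference)) / len(reference)


# Hypothetical detector outputs: 4 frames, 3 watermark bits.
message = [1, 0, 1]
frames = [[0.9, -0.7, 0.2],
          [0.4, -0.1, 0.6],
          [-0.2, -0.8, 0.5],
          [0.7, -0.3, -0.1]]

full = decode_bits(frames)                        # original frame order
reordered = decode_bits(list(reversed(frames)))   # desynchronized order
cut = decode_bits(frames[1:3])                    # first and last frame removed
```

In this toy example all three decodes recover `[1, 0, 1]`, so the BER against `message` is 0.0 in each case. Individual frames can carry weak or even wrong evidence for a bit (frame 3 for bit 0, frame 4 for bit 2), but the pooled score still lands on the correct side of the threshold.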

🛡️ Threat Analysis

Output Integrity Attack

AWARE embeds watermarks in audio content (model outputs) to support provenance tracking and authenticity verification of AI-generated audio, a classic case of output integrity / content watermarking. The watermark resides in the audio signal itself, not in model weights.

Details

Domains

audio, generative

Model Types

gan, diffusion

Threat Tags

digital, training_time

Applications

2025 0 cit.

Similar Papers

VocBulwark: Towards Practical Generative Speech Watermarking via Additional-Parameter Injection

Smark: A Watermark for Text-to-Speech Diffusion Models via Discrete Wavelet Transform

AFSS: Artifact-Focused Self-Synthesis for Mitigating Bias in Audio Deepfake Detection

RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting

High-Fidelity Face Content Recovery via Tamper-Resilient Versatile Watermarking

Anti-Tamper Protection for Unauthorized Individual Image Generation

Efficient Zero-Shot AI-Generated Image Detection

Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification