E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis
Zhisheng Zhang 1,2, Derui Wang 3, Yifan Mi 2, Zhiyong Wu 1, Jie Gao 1, Yuxin Cao 4, Kai Ye 5, Minhui Xue 3,6, Jie Hao 2
Published on arXiv (arXiv:2511.07099)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
E2E-VGuard successfully protects timbre and pronunciation against 16 open-source synthesizers and 3 commercial APIs across Chinese and English, validated in real-world deployment.
E2E-VGuard
Novel technique introduced
Recent advancements in speech synthesis have enriched our daily lives, with high-quality, human-like audio widely adopted across real-world applications. However, malicious exploitation such as voice-cloning fraud poses severe security risks. Existing defense techniques struggle to counter production large language model (LLM)-based speech synthesis. While previous studies have considered protection against fine-tuned synthesizers, they assume manually annotated transcripts. Given the labor intensity of manual annotation, end-to-end (E2E) systems that leverage automatic speech recognition (ASR) to generate transcripts are becoming increasingly prevalent, e.g., voice cloning via commercial APIs. Such E2E speech synthesis therefore requires new security mechanisms. To tackle these challenges, we propose E2E-VGuard, a proactive defense framework against two emerging threats: (1) production LLM-based speech synthesis, and (2) the novel attack arising from ASR-driven E2E scenarios. Specifically, we employ an encoder ensemble with a feature extractor to protect timbre, while ASR-targeted adversarial examples disrupt pronunciation. Moreover, we incorporate a psychoacoustic model to ensure the imperceptibility of the perturbation. For a comprehensive evaluation, we test 16 open-source synthesizers and 3 commercial APIs across Chinese and English datasets, confirming E2E-VGuard's effectiveness in timbre and pronunciation protection. Real-world deployment validation is also conducted. Our code and demo page are available at https://wxzyd123.github.io/e2e-vguard/.
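The timbre-protection idea in the abstract (an encoder ensemble attacked under a perturbation budget) can be sketched with a toy projected-gradient ascent. This is a minimal illustration, not the paper's implementation: the random linear `encoders` stand in for neural speaker encoders so that gradients stay analytic, and all hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for speaker encoders: random linear embeddings.
# A real system would ensemble neural speaker encoders behind a shared
# feature extractor; linear maps keep the gradient analytic here.
encoders = [rng.standard_normal((32, 256)) / 16.0 for _ in range(3)]

def ensemble_loss(x_adv, x_ref):
    # Total squared embedding distance across the encoder ensemble.
    return sum(float(np.sum((W @ x_adv - W @ x_ref) ** 2)) for W in encoders)

def protect_timbre(x, eps=0.05, alpha=0.005, steps=50):
    """PGD-style sign ascent: push every encoder's embedding of the
    protected audio away from the original speaker embedding, under an
    L-infinity budget eps (hyperparameters are illustrative)."""
    delta = rng.uniform(-eps / 10, eps / 10, size=x.shape)  # random start
    for _ in range(steps):
        # Analytic gradient: d/d_delta ||W(x+delta) - W x||^2 = 2 W^T W delta
        grad = sum(2.0 * W.T @ (W @ (x + delta) - W @ x) for W in encoders)
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
    return x + delta

x = rng.standard_normal(256) * 0.1   # stand-in waveform window
x_prot = protect_timbre(x)
```

The clip step enforces the imperceptibility budget, while the sign ascent maximizes the embedding distance jointly over all encoders, the standard way an ensemble attack improves transfer to unseen synthesizers.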
Key Contributions
- Encoder ensemble with feature extractor to craft adversarial perturbations that protect timbre from LLM-based speech synthesizers
- ASR-targeted adversarial examples that disrupt pronunciation in end-to-end voice cloning pipelines relying on automatic transcription
- Psychoacoustic model integration to ensure perturbations are imperceptible to human listeners
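The second bullet (disrupting the ASR stage of E2E cloning pipelines) can be illustrated with untargeted gradient ascent on a toy per-frame classifier. This is a hypothetical stand-in for an ASR acoustic model, assumed here only for the sketch; real ASR attacks operate on sequence models (e.g., CTC losses), and all names and hyperparameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical per-frame phone classifier standing in for an ASR
# acoustic model (real pipelines use sequence models with CTC loss).
W = rng.standard_normal((10, 64)) / 8.0

def ce_loss(frames, labels):
    # Mean cross-entropy of the correct per-frame labels.
    return float(np.mean([-np.log(softmax(W @ f)[y] + 1e-12)
                          for f, y in zip(frames, labels)]))

def disrupt_pronunciation(frames, labels, eps=0.3, alpha=0.02, steps=40):
    """Untargeted sign-gradient ascent: raise the cross-entropy of the
    correct labels so the ASR stage mis-transcribes the protected audio."""
    delta = np.zeros_like(frames)
    for _ in range(steps):
        grads = np.empty_like(frames)
        for t, (f, y) in enumerate(zip(frames + delta, labels)):
            p = softmax(W @ f)
            p[y] -= 1.0              # softmax minus one-hot
            grads[t] = W.T @ p       # dCE/dframe for a linear classifier
        delta = np.clip(delta + alpha * np.sign(grads), -eps, eps)
    return frames + delta

frames = rng.standard_normal((20, 64))
labels = np.array([int(np.argmax(W @ f)) for f in frames])  # clean "transcript"
frames_adv = disrupt_pronunciation(frames, labels)
```

Because the E2E pipeline trusts its own ASR output, corrupting the transcription propagates downstream: the synthesizer is fine-tuned on wrong text-audio pairs, degrading the clone's pronunciation.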
🛡️ Threat Analysis
The core technical contribution is the adversarial-example methodology: gradient-based perturbations, constrained by a psychoacoustic model, that fool speech synthesis pipeline components at inference time. Encoder-ensemble-targeted examples protect timbre, while ASR-targeted adversarial examples disrupt pronunciation.
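The psychoacoustic constraint mentioned above can be approximated with a crude frequency-domain masking proxy: perturbation energy is allowed only some margin below the clean signal's magnitude in each bin, so it hides under louder content. This is an illustrative simplification, not the paper's psychoacoustic model; the function name and `margin_db` default are assumptions.

```python
import numpy as np

def mask_constrain(x_clean, delta, margin_db=-20.0):
    """Crude frequency-domain masking proxy (illustrative; a proper
    psychoacoustic model computes per-band masking thresholds): cap
    each frequency bin of the perturbation at margin_db below the
    clean signal's magnitude in that bin."""
    X = np.fft.rfft(x_clean)
    D = np.fft.rfft(delta)
    ceiling = np.abs(X) * 10.0 ** (margin_db / 20.0)   # per-bin threshold
    mag = np.abs(D)
    scale = np.where(mag > ceiling, ceiling / (mag + 1e-12), 1.0)
    # Real, nonnegative scaling preserves conjugate symmetry,
    # so the inverse transform stays real-valued.
    return np.fft.irfft(D * scale, n=len(delta))
```

In an attack loop, a projection like this would run after each gradient step, trading a little adversarial strength for imperceptibility to human listeners.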
The application goal is protecting audio content integrity against voice-cloning fraud (deepfake audio generation). Anti-voice-cloning perturbations are the audio analog of anti-deepfake image perturbations, which fall under output integrity and content authenticity protection in ML09.