attack 2025

Cryptanalysis of Pseudorandom Error-Correcting Codes

Tianrui Wang ¹, Anyu Wang ¹, Tianshuo Cong ², Delong Ran ¹, Jinyuan Liu ¹, Xiaoyun Wang ¹

¹ Tsinghua University

² Shandong University

0 citations · 48 references · IACR ePrint

Published on arXiv: 2512.17310

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Attacks break PRC watermark undetectability with overwhelming probability at a cost of only 2^22 operations, validated on real-world DeepSeek and Stable Diffusion models, and PRC cannot achieve 128-bit security even with proposed parameter fixes.

Meet-in-the-Middle Attack on PRC

Novel technique introduced

A pseudorandom error-correcting code (PRC) is a novel cryptographic primitive proposed at CRYPTO 2024. Because it combines pseudorandomness with error correction, PRC has been recognized as a promising foundational component for watermarking AI-generated content. However, the security of PRC has not been thoroughly analyzed, especially under concrete parameters or against cryptographic attacks. To fill this gap, we present the first cryptanalysis of PRC. We first propose three attacks that challenge the undetectability and robustness assumptions of PRC. Two of them aim to distinguish PRC-based codewords from plain vectors, and one aims to compromise the decoding process of PRC. Our attacks successfully undermine the claimed security guarantees across all parameter configurations. Notably, our attack can detect the presence of a watermark with overwhelming probability at a cost of $2^{22}$ operations. We also validate our approach by attacking real-world large generative models such as DeepSeek and Stable Diffusion. To mitigate our attacks, we further propose three defenses to enhance the security of PRC: parameter suggestions, implementation suggestions, and a revised key generation algorithm. The revised key generation algorithm effectively prevents the occurrence of weak keys. However, we highlight that the current PRC-based watermarking scheme still cannot achieve 128-bit security under our parameter suggestions, owing to inherent configurations of large generative models such as the maximum output length of large language models.

Key Contributions

Three novel attacks against PRC: two distinguishing attacks (break undetectability, separating PRC codewords from random vectors) and one decoding-compromise attack (breaks robustness), collectively undermining all claimed PRC security guarantees across all parameter configurations.
Real-world validation of the attacks against DeepSeek (LLM) and Stable Diffusion, detecting embedded watermarks with overwhelming probability at only 2^22 operations using a meet-in-the-middle technique.
Three mitigations including revised key generation to eliminate weak keys, with the finding that PRC-based watermarking still cannot achieve 128-bit security due to inherent LLM output-length constraints.
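
The meet-in-the-middle idea named in the contributions can be illustrated with a toy sketch. This is not the paper's algorithm, which this summary does not describe. It assumes, for illustration only, that codewords satisfy one hidden sparse parity check (as in LDPC-style constructions) and that samples are noise-free. The names `sample_codewords` and `mitm_find_check` and the parameters (24-bit words, weight-4 check, 40 samples) are hypothetical.

```python
import itertools
import random


def sample_codewords(n, check, m, rng):
    """Toy stand-in for codewords: m uniform n-bit words that all satisfy
    one planted parity check (XOR of the bits indexed by `check` is 0)."""
    words = []
    for _ in range(m):
        w = rng.getrandbits(n)
        if sum((w >> i) & 1 for i in check) % 2:
            w ^= 1 << check[-1]  # flip one checked bit to fix the parity
        words.append(w)
    return words


def xor_columns(cols, idx):
    """XOR together the packed bit-columns selected by idx."""
    acc = 0
    for i in idx:
        acc ^= cols[i]
    return acc


def mitm_find_check(words, n, t):
    """Look for t positions whose XOR is 0 on every sample.

    Meet in the middle: tabulate every subset of t//2 positions by its
    XOR pattern across the samples, then look up each subset of the
    remaining t - t//2 positions. Equal patterns cancel to zero.
    """
    # cols[i] packs bit i of every sample into one integer (sample j -> bit j).
    cols = [sum(((w >> i) & 1) << j for j, w in enumerate(words))
            for i in range(n)]
    half = t // 2
    table = {}
    for left in itertools.combinations(range(n), half):
        table.setdefault(xor_columns(cols, left), []).append(left)
    for right in itertools.combinations(range(n), t - half):
        for left in table.get(xor_columns(cols, right), []):
            if set(left).isdisjoint(right):
                return tuple(sorted(left + right))
    return None  # no weight-t check found: samples look random


# Hypothetical toy parameters, chosen only so the demo runs instantly.
rng = random.Random(0)
hidden = (1, 5, 11, 17)
coded = sample_codewords(24, hidden, 40, rng)
plain = [rng.getrandbits(24) for _ in range(40)]
print("coded samples:", mitm_find_check(coded, 24, 4))
print("plain samples:", mitm_find_check(plain, 24, 4))
```

In this toy setting the table holds C(24, 2) = 276 half-subsets, whereas a brute-force search would test C(24, 4) = 10626 full subsets. That gap is the general reason a meet-in-the-middle search is cheaper. A distinguisher built this way reports "coded" when a check is found and "plain" otherwise.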

🛡️ Threat Analysis

Output Integrity Attack

PRC (pseudorandom error-correcting codes) is a cryptographic primitive used to watermark AI-generated content (LLM text, diffusion-model images) for output provenance. This paper attacks the watermarking scheme itself: it defeats undetectability (an adversary can tell that a watermark is present) and compromises decoding (breaking robustness). Both are attacks on AI output integrity and content authentication, directly targeting ML09.

Details

Domains

nlp, generative, vision

Model Types

llm, diffusion

Threat Tags

black_box, inference_time

Datasets

DeepSeek-generated text, Stable Diffusion-generated images

Applications

ai-generated content watermarking, llm text watermarking, image generation watermarking

Similar Papers

Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection

Naïve Exposure of Generative AI Capabilities Undermines Deepfake Detection

SHIFT: Stochastic Hidden-Trajectory Deflection for Removing Diffusion-based Watermark

Identifying Models Behind Text-to-Image Leaderboards

Nondeterminism-Aware Optimistic Verification for Floating-Point Neural Networks

MarkCleaner: High-Fidelity Watermark Removal via Imperceptible Micro-Geometric Perturbation

Understanding Semantic Perturbations on In-Processing Generative Image Watermarks

Text-to-Image Models Leave Identifiable Signatures: Implications for Leaderboard Security