attack 2026

Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems

Yuji Yamamoto, Satoshi Matsuura

0 citations

Published on arXiv: 2604.17249

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Software fault injection shows that 13 of 16 BF16 bit positions in KV-cache blocks produce silent, coherent output divergence. Cumulative damage grows linearly with subsequent requests and, unless the corruption is detected, is bounded only by how long the block remains cached.

KV-Cache Bit-Flip Attack

Novel technique introduced


Rowhammer on GPU DRAM has enabled adversarial bit flips in model weights; shared KV-cache blocks in LLM serving systems present an analogous but previously unexamined target. In vLLM's Prefix Caching, these blocks exist as a single physical copy without integrity protection. Using software fault injection under ideal bit targeting, we characterize worst-case severity and identify three properties: (1) Silent divergence: 13 of 16 BF16 bit positions produce coherent but altered outputs, indistinguishable from legitimate responses without a clean baseline. (2) Selective propagation: only requests sharing the targeted prefix are affected. (3) Persistent accumulation: no temporal decay occurs, so cumulative damage grows linearly with subsequent requests. Together, these constitute a threat profile distinct from weight corruption: silent divergence and selective propagation enable detection evasion; persistent accumulation then proceeds unchecked, yielding damage amplification bounded only by how long the block remains cached. A checksum-based countermeasure detects any single-bit corruption at scheduling time, bounding cumulative damage to one batch independent of the block's cache lifetime, with negligible overhead. These results argue for integrity protection of prefix blocks before end-to-end exploitation is demonstrated.
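
To make the "16 BF16 bit positions" concrete, the following minimal sketch flips each bit of a single BF16-encoded value and prints the result. It relies only on the standard BF16 layout (1 sign bit, 8 exponent bits, 7 mantissa bits, i.e. the upper half of an IEEE 754 float32). The helper names are hypothetical, and the sketch does not reproduce the paper's fault-injection setup or indicate which 13 positions cause silent divergence.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern for x (float32 with the low 16 bits truncated)."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def bf16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return x


def flip_bit(x: float, position: int) -> float:
    """Flip one bit of x's BF16 encoding (0 = mantissa LSB, 15 = sign bit)."""
    return bf16_bits_to_float(float_to_bf16_bits(x) ^ (1 << position))


if __name__ == "__main__":
    value = 0.5  # illustrative value, not taken from the paper
    for position in range(16):
        print(position, flip_bit(value, position))
```

Flipping a low mantissa bit perturbs the value slightly, while flipping a high exponent bit changes its magnitude by many orders, which is one intuition for why different bit positions can have very different effects on model output.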


Key Contributions

  • First characterization of bit-flip attacks on shared KV-cache blocks in LLM serving systems, identifying three threat properties: silent divergence (13 of 16 BF16 bit positions produce coherent but altered outputs), selective propagation (only requests sharing the targeted prefix are affected), and persistent accumulation (cumulative damage grows linearly with subsequent requests)
  • Demonstration that KV-cache corruption creates a threat profile distinct from weight corruption: detection evasion via silent divergence and selective propagation, plus damage amplification bounded only by how long the block remains cached
  • Checksum-based integrity protection countermeasure that bounds cumulative damage to one batch with negligible overhead
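
The checksum idea in the last contribution can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the summary does not name the checksum, so CRC32 is assumed here (a CRC detects every single-bit error), and `ChecksummedBlock` and `schedule` are hypothetical names that do not correspond to vLLM data structures or APIs.

```python
import zlib


class ChecksummedBlock:
    """Hypothetical stand-in for one shared prefix block (not a vLLM type)."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        # Assumption: the checksum is recorded once, when the block is filled.
        self.checksum = zlib.crc32(self.data)

    def verify(self) -> bool:
        """Return True if the block contents still match the stored checksum."""
        return zlib.crc32(self.data) == self.checksum


def schedule(block: ChecksummedBlock, recompute):
    """Verify a block before reuse; on mismatch, rebuild it via recompute().

    Returns (block_to_use, was_replaced). `recompute` is a hypothetical
    callback that regenerates the block contents from the prefix tokens.
    """
    if not block.verify():
        return ChecksummedBlock(recompute()), True
    return block, False


if __name__ == "__main__":
    original = bytes(range(64))
    block = ChecksummedBlock(original)
    block.data[10] ^= 0x04  # simulate a single-bit flip
    block, replaced = schedule(block, lambda: original)
    print("replaced:", replaced, "valid:", block.verify())
```

Checking at scheduling time means a corrupted block can serve at most the requests already in flight before the next check, which is the sense in which damage is bounded to one batch.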

🛡️ Threat Analysis

AI Supply Chain Attacks

The attack targets LLM serving infrastructure, specifically the shared KV-cache blocks in vLLM's Prefix Caching system, exploiting a vulnerability in the deployment pipeline rather than in the model itself. The threat vector is infrastructure-level memory corruption (analogous to Rowhammer on GPU DRAM) targeting the serving system's caching mechanism.

Output Integrity Attack

The attack corrupts cached intermediate representations that directly affect output integrity. The three characterized properties (silent divergence producing altered but coherent outputs, selective propagation, persistent accumulation) all relate to tampering with model outputs via corrupted cache state. The checksum-based countermeasure verifies cache block integrity to prevent output corruption.
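
Selective propagation and persistent accumulation can be illustrated with a toy counting model. The workload, batch size, and function name below are all hypothetical; the sketch only counts how many requests read a corrupted block, with and without a per-batch integrity check, and says nothing about actual output quality.

```python
def count_corrupted(requests, target_prefix, flip_at, batch_size, verify):
    """Count requests served from a corrupted shared block.

    requests:      prefix ids in arrival order (hypothetical workload)
    target_prefix: prefix whose cached block receives the bit flip
    flip_at:       index of the first request arriving after the flip
    batch_size:    number of requests per scheduling batch (assumed fixed)
    verify:        if True, the block is checked at the start of each batch
                   and rebuilt on mismatch
    """
    corrupted = False
    affected = 0
    for i, prefix in enumerate(requests):
        if i == flip_at:
            corrupted = True
        if verify and corrupted and i % batch_size == 0:
            corrupted = False  # detected at scheduling time, block rebuilt
        # Selective propagation: only requests sharing the prefix are hit.
        if corrupted and prefix == target_prefix:
            affected += 1
    return affected


if __name__ == "__main__":
    workload = ["A", "B"] * 50  # two prefixes, interleaved
    print(count_corrupted(workload, "A", 3, 8, verify=False))  # 48
    print(count_corrupted(workload, "A", 3, 8, verify=True))   # 2
```

Without verification the count keeps growing with every later request on the targeted prefix; with a per-batch check it stops at the batch boundary following the flip.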


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time, targeted
Applications
llm serving systems, vllm prefix caching