Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems

Rowhammer on GPU DRAM has enabled adversarial bit flips in model weights; shared KV-cache blocks in LLM serving systems present an analogous but previously unexamined target. In vLLM's Prefix Caching, these blocks exist as a single physical copy without integrity protection. Using software fault injection under ideal bit targeting, we characterize worst-case severity and identify three properties: (1) Silent divergence - 13 of 16 BF16 bit positions produce coherent but altered outputs, indistinguishable from legitimate responses without a clean baseline. (2) Selective propagation - only requests sharing the targeted prefix are affected. (3) Persistent accumulation - no temporal decay occurs, so cumulative damage grows linearly with subsequent requests. Together, these constitute a threat profile distinct from weight corruption: silent divergence and selective propagation enable detection evasion; persistent accumulation then proceeds unchecked, yielding damage amplification bounded only by how long the block remains cached. A checksum-based countermeasure detects any single-bit corruption at scheduling time, bounding cumulative damage to one batch independent of the block's cache lifetime, with negligible overhead. These results argue for integrity protection of prefix blocks before end-to-end exploitation is demonstrated.
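To make the per-bit-position framing concrete, the following minimal sketch flips a single bit of one BF16 value in software. It is an illustration only, not the authors' fault-injection harness: the helper names (`float_to_bf16_bits`, `bf16_bits_to_float`, `flip_bit`) are hypothetical, and BF16 is approximated here by truncating a float32 to its upper 16 bits (1 sign bit, 8 exponent bits, 7 mantissa bits).

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern of x, obtained by truncating float32 (no rounding)."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def bf16_bits_to_float(bits: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))
    return value


def flip_bit(x: float, position: int) -> float:
    """Flip one bit of x's BF16 pattern (0 = lowest mantissa bit, 15 = sign bit)."""
    if not 0 <= position < 16:
        raise ValueError("BF16 has bit positions 0..15")
    return bf16_bits_to_float(float_to_bf16_bits(x) ^ (1 << position))


if __name__ == "__main__":
    # Show how the same stored value changes under each possible single-bit flip.
    for pos in range(16):
        print(pos, flip_bit(1.0, pos))
```

Low mantissa bits perturb the value only slightly, while the sign bit and high exponent bits change it drastically. Which positions yield coherent output in an actual model is an empirical result of the paper and is not reproduced by this sketch.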

Key Contributions

First characterization of bit-flip attacks on shared KV-cache blocks in LLM serving systems, identifying three threat properties: silent divergence (13 of 16 BF16 bit positions produce coherent but altered outputs), selective propagation (only requests sharing the targeted prefix are affected), and persistent accumulation (cumulative damage grows linearly with subsequent requests)
Demonstration that KV-cache corruption creates a threat profile distinct from weight corruption: silent divergence and selective propagation enable detection evasion, while damage amplification is bounded only by how long the block remains cached
Checksum-based integrity countermeasure that detects single-bit corruption at scheduling time and bounds cumulative damage to one batch with negligible overhead
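The difference between unchecked accumulation and the one-batch bound can be sketched as a toy count. This is an illustrative model under simplifying assumptions, not the paper's measurement: every request that reuses the corrupted prefix is counted as affected, and a protected system is assumed to catch the corruption before the next batch is scheduled. The function name and parameters are hypothetical.

```python
def affected_requests(requests_while_cached: int, batch_size: int, protected: bool) -> int:
    """Toy count of requests served from a corrupted prefix block.

    requests_while_cached: requests reusing the prefix between the bit flip
        and the block's eviction (grows with cache lifetime).
    batch_size: requests scheduled together in one batch.
    protected: whether block integrity is verified at scheduling time.
    """
    if requests_while_cached < 0 or batch_size <= 0:
        raise ValueError("counts must be non-negative and batch_size positive")
    if protected:
        # Assumed: detection at the next scheduling step caps damage at one batch.
        return min(requests_while_cached, batch_size)
    # No integrity check and no temporal decay: damage grows linearly.
    return requests_while_cached


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, affected_requests(n, 8, protected=False), affected_requests(n, 8, protected=True))
```

In this model the unprotected count tracks the cache lifetime, while the protected count stays at or below the batch size regardless of how long the block remains cached.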

🛡️ Threat Analysis

AI Supply Chain Attacks

The attack targets the LLM serving infrastructure, specifically shared KV-cache blocks in vLLM's Prefix Caching system, and exploits a weakness in the deployment pipeline rather than in the model itself. The threat vector is infrastructure-level memory corruption (analogous to Rowhammer on GPU DRAM) aimed at the serving system's caching mechanism.

Output Integrity Attack

The attack corrupts cached intermediate representations that directly affect output integrity. The three characterized properties (silent divergence producing altered but coherent outputs, selective propagation, persistent accumulation) all relate to tampering with model outputs via corrupted cache state. The checksum-based countermeasure verifies cache block integrity to prevent output corruption.
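A scheduling-time integrity check of the kind described above might look like the following minimal sketch. It is an assumption-laden illustration, not the paper's implementation and not vLLM's API: the `BlockChecksums` class, the `usable_blocks` helper, and the choice of CRC32 (via Python's standard `zlib.crc32`) are all hypothetical. CRC32 is used here because it detects any single-bit change in the protected data.

```python
import zlib
from typing import Dict, List, Tuple


class BlockChecksums:
    """Hypothetical side table mapping cached block IDs to CRC32 checksums."""

    def __init__(self) -> None:
        self._crc: Dict[int, int] = {}

    def seal(self, block_id: int, data: bytes) -> None:
        """Record the checksum when a prefix block is first written to the cache."""
        self._crc[block_id] = zlib.crc32(data)

    def verify(self, block_id: int, data: bytes) -> bool:
        """Return True only if the block is known and its contents are unchanged."""
        expected = self._crc.get(block_id)
        return expected is not None and expected == zlib.crc32(data)


def usable_blocks(
    cache: Dict[int, bytes], checksums: BlockChecksums
) -> Tuple[List[int], List[int]]:
    """Split cached blocks into (intact, corrupted) before a batch is scheduled."""
    intact: List[int] = []
    corrupted: List[int] = []
    for block_id, data in cache.items():
        (intact if checksums.verify(block_id, data) else corrupted).append(block_id)
    return intact, corrupted


if __name__ == "__main__":
    checksums = BlockChecksums()
    cache = {0: bytes(64), 1: bytes(range(64))}
    for block_id, data in cache.items():
        checksums.seal(block_id, data)

    # Simulate a single-bit flip in block 1 after it was sealed.
    damaged = bytearray(cache[1])
    damaged[10] ^= 0x04
    cache[1] = bytes(damaged)

    print(usable_blocks(cache, checksums))
```

A corrupted block would then be evicted and recomputed rather than reused, which is what limits exposure to the batch in flight when the flip occurred. A real serving system would checksum GPU-resident tensors rather than host `bytes`, so the overhead characteristics of this sketch say nothing about the paper's reported overhead.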

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_time
targeted

Applications

2025 0 cit.
