From Similarity to Vulnerability: Key Collision Attack on LLM Semantic Caching
Zhixiang Zhang 1, Zesen Liu 1, Yuchong Xie 1, Quanfeng Huang 2, Dongdong She 1
Published on arXiv
2601.23088
Output Integrity Attack
OWASP ML Top 10 — ML09
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
CacheAttack achieves an 86% cache hit rate in LLM response hijacking and can induce malicious agent behaviors, with strong transferability across different embedding models; semantic caching itself is widely deployed by major providers including AWS and Microsoft.
CacheAttack
Novel technique introduced
Semantic caching has emerged as a pivotal technique for scaling LLM applications, widely adopted by major providers including AWS and Microsoft. By utilizing semantic embedding vectors as cache keys, this mechanism effectively minimizes latency and redundant computation for semantically similar queries. In this work, we conceptualize semantic cache keys as a form of fuzzy hashes. We demonstrate that the locality required to maximize cache hit rates fundamentally conflicts with the cryptographic avalanche effect necessary for collision resistance. Our conceptual analysis formalizes this inherent trade-off between performance (locality) and security (collision resilience), revealing that semantic caching is naturally vulnerable to key collision attacks. While prior research has focused on side-channel and privacy risks, we present the first systematic study of integrity risks arising from cache collisions. We introduce CacheAttack, an automated framework for launching black-box collision attacks. We evaluate CacheAttack in security-critical tasks and agentic workflows. It achieves a hit rate of 86% in LLM response hijacking and can induce malicious behaviors in LLM agents, while preserving strong transferability across different embedding models. A case study on a financial agent further illustrates the real-world impact of these vulnerabilities. Finally, we discuss mitigation strategies.
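The fuzzy-hash framing can be made concrete with a minimal sketch. The code below is not the paper's implementation: the `SemanticCache` class, the cosine-similarity threshold of 0.95, and the hand-made three-dimensional "embeddings" are all illustrative assumptions. It shows why the locality that makes semantic caching useful (a nearby embedding counts as the same key) also means an attacker who can seed the cache with a nearby key hijacks responses for later legitimate queries.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

class SemanticCache:
    """Toy semantic cache: a lookup hits whenever the query embedding falls
    within a similarity threshold of a stored key, so the key behaves like a
    fuzzy hash rather than an exact-match cryptographic hash."""
    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (key_embedding, cached_response)

    def put(self, key_emb, response):
        self.entries.append((key_emb, response))

    def get(self, query_emb):
        for key_emb, response in self.entries:
            if cosine(query_emb, key_emb) >= self.threshold:
                return response  # hit via similarity, not equality
        return None

cache = SemanticCache(threshold=0.95)

# Attacker seeds the cache with a malicious response under an embedding
# crafted to sit close to the victim query's embedding (toy vectors here).
attacker_emb = [0.98, 0.23, 0.02]
cache.put(attacker_emb, "POISONED RESPONSE")

# A later, legitimate query embeds nearby and collides with the poisoned key.
victim_emb = [1.0, 0.2, 0.0]
served = cache.get(victim_emb)  # returns "POISONED RESPONSE"
```

An exact-match key (e.g. SHA-256 of the query) would avoid the collision entirely, but would also miss every paraphrase, which is precisely the locality/collision-resistance trade-off the paper formalizes.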
Key Contributions
- Formalizes the fundamental security-performance trade-off in LLM semantic caching, showing that locality (required for cache hits) inherently conflicts with collision resistance (required for integrity)
- Introduces CacheAttack, the first automated black-box framework for launching cache key collision attacks against LLM semantic caches
- Demonstrates an 86% cache hit rate in LLM response hijacking and successful induction of malicious LLM agent behaviors, with strong transferability across different embedding models; includes a financial agent case study
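The transferability claim can be illustrated with a hedged toy model. CacheAttack itself optimizes natural-language prompts against black-box embedding services; the sketch below instead hill-climbs directly on raw vectors and models the relation between an attacker's local surrogate embedding and the victim's embedding as small additive noise. Every name here (`surrogate_emb`, the noise scale, the step sizes) is a hypothetical stand-in, chosen only to show why a collision found against a surrogate model can carry over to the victim's model.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

random.seed(1)
DIM = 16

# Victim-side embedding of the query to hijack (not directly queryable).
victim_emb = [random.uniform(-1, 1) for _ in range(DIM)]

# Assumption: the attacker's local surrogate model produces embeddings
# correlated with the victim's, modeled as the victim embedding plus noise.
surrogate_emb = [v + random.uniform(-0.05, 0.05) for v in victim_emb]

def surrogate_score(cand):
    """Similarity signal available to the attacker (surrogate model only)."""
    return cosine(cand, surrogate_emb)

# Black-box hill climb: perturb one coordinate at a time, keep the move
# whenever the surrogate similarity improves.
cand = [random.uniform(-1, 1) for _ in range(DIM)]
for _ in range(5000):
    trial = cand[:]
    trial[random.randrange(DIM)] += random.uniform(-0.1, 0.1)
    if surrogate_score(trial) > surrogate_score(cand):
        cand = trial

# Transferability check: similarity under the *victim's* embedding model,
# which the attacker never queried during the search.
transfer_sim = cosine(cand, victim_emb)
```

Because the surrogate and victim embeddings point in nearly the same direction, a candidate optimized only against the surrogate still lands inside the victim cache's similarity threshold, which is the intuition behind the paper's cross-model transferability results.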
🛡️ Threat Analysis
The paper explicitly frames itself as studying "integrity risks arising from cache collisions." The attack causes LLM systems to serve attacker-chosen cached responses in place of legitimate fresh outputs, a direct output integrity violation. The paper's core contribution is demonstrating that semantic cache keys are inherently collision-vulnerable, leading to manipulated or hijacked responses being delivered to users.