DRAINCODE: Stealthy Energy Consumption Attacks on Retrieval-Augmented Code Generation via Context Poisoning
Yanlin Wang 1, Jiadong Wu 1, Tianyue Jiang 1, Mingwei Liu 1, Jiachi Chen 1, Chong Wang 2, Ensheng Shi 3, Xilin Liu 3, Yuchi Ma 3, Zibin Zheng 1
Published on arXiv
arXiv:2601.20615
Model Denial of Service
OWASP LLM Top 10 — LLM04
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
DrainCode achieves up to an 85% increase in GPU latency, a 49% increase in energy consumption, and more than a 3x increase in output length on RAG-based code generation LLMs compared to the baseline.
DrainCode
Novel technique introduced
Large language models (LLMs) have demonstrated impressive capabilities in code generation by leveraging retrieval-augmented generation (RAG) methods. However, the computational costs associated with LLM inference, particularly latency and energy consumption, have received limited attention in the security context. This paper introduces DrainCode, the first adversarial attack targeting the computational efficiency of RAG-based code generation systems. By strategically poisoning retrieval contexts through a mutation-based approach, DrainCode forces LLMs to produce significantly longer outputs, thereby increasing GPU latency and energy consumption. We evaluate the effectiveness of DrainCode across multiple models. Our experiments show that DrainCode achieves up to an 85% increase in latency, a 49% increase in energy consumption, and more than a 3x increase in output length compared to the baseline. Furthermore, we demonstrate the generalizability of the attack across different prompting strategies and its robustness against existing defenses. The results highlight DrainCode as a potential method for increasing the computational overhead of LLMs, making it useful for evaluating LLM security in resource-constrained environments. We provide code and data at https://github.com/DeepSoftwareAnalytics/DrainCode.
Key Contributions
- First adversarial attack targeting computational efficiency (latency and energy) of RAG-based code generation LLMs rather than output correctness
- Mutation-based context poisoning strategy that injects adversarially crafted code snippets into retrieval databases to force verbose LLM outputs
- Empirical evaluation across multiple LLMs showing up to 85% latency increase, 49% energy increase, and >3x output length increase, with analysis of defenses and prompting strategies
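The contributions above describe a mutation-based search over poisoned retrieval snippets, optimized to inflate output length (a proxy for latency and energy). The paper's exact mutation operators and scoring are in the linked repository; the sketch below is only a minimal illustration of the general hill-climbing idea, with a stub model and a hypothetical single-token-insertion mutation standing in for the real components.

```python
import random


def output_length(model, query, context):
    """Proxy objective: number of tokens the model emits given a poisoned context."""
    return len(model(query, context).split())


def mutate(snippet, vocab, rng):
    """Hypothetical mutation operator: insert one candidate token at a random position."""
    words = snippet.split()
    words.insert(rng.randrange(len(words) + 1), rng.choice(vocab))
    return " ".join(words)


def drain_search(model, query, seed_snippet, vocab, iterations=50, seed=0):
    """Greedy hill-climbing: keep any mutation that lengthens the model's output."""
    rng = random.Random(seed)
    best = seed_snippet
    best_len = output_length(model, query, best)
    for _ in range(iterations):
        candidate = mutate(best, vocab, rng)
        cand_len = output_length(model, query, candidate)
        if cand_len > best_len:  # accept only length-increasing mutations
            best, best_len = candidate, cand_len
    return best, best_len


def stub_model(query, context):
    """Toy stand-in for an LLM: emits one extra token per 'explain' trigger in the context."""
    n = 1 + context.count("explain")
    return " ".join(["token"] * n)


if __name__ == "__main__":
    seed_snippet = "def sort(xs): return sorted(xs)"
    poison, length = drain_search(
        stub_model, "sort a list", seed_snippet,
        vocab=["explain", "verbose", "comment"], iterations=40,
    )
    print(length)
```

In a real attack the stub would be replaced by queries against the target RAG pipeline, and the mutated snippet would be planted in the retrieval database so it surfaces for the victim's queries; the greedy accept rule guarantees the objective is monotonically non-decreasing over the search.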