CoTDeceptor: Adversarial Code Obfuscation Against CoT-Enhanced LLM Code Agents
Haoyang Li 1, Mingjin Li, Jinxin Zuo 2,3, Siqi Li 1, Xiao Li 4, Hao Wu 4, Yueming Lu 1, Xiaochuan He 5
1 Beijing University of Posts and Telecommunications
Published on arXiv
2512.21250
Input Manipulation Attack
OWASP ML Top 10 — ML01
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
CoTDeceptor evades detection in 14 out of 15 vulnerability categories against state-of-the-art CoT-enhanced LLMs, compared to only 2 for prior obfuscation methods.
CoTDeceptor
Novel technique introduced
LLM-based code agents (e.g., ChatGPT Codex) are increasingly deployed as detectors for code review and security auditing tasks. Although CoT-enhanced LLM vulnerability detectors are believed to provide improved robustness against obfuscated malicious code, we find that their reasoning chains and semantic abstraction processes exhibit exploitable systematic weaknesses. This allows attackers to covertly embed malicious logic, bypass code review, and propagate backdoored components throughout real-world software supply chains. To investigate this issue, we present CoTDeceptor, the first adversarial code obfuscation framework targeting CoT-enhanced LLM detectors. CoTDeceptor autonomously constructs evolving, hard-to-reverse multi-stage obfuscation strategy chains that effectively disrupt CoT-driven detection logic. Using malicious code samples provided by a security enterprise, our experiments demonstrate that CoTDeceptor achieves stable and transferable evasion performance against state-of-the-art LLMs and vulnerability detection agents. CoTDeceptor bypasses 14 out of 15 vulnerability categories, compared to only 2 bypassed by prior methods. Our findings highlight potential risks in real-world software supply chains and underscore the need for more robust and interpretable LLM-powered security analysis systems.
Key Contributions
- CoTDeceptor: the first adversarial code obfuscation framework specifically targeting CoT-enhanced LLM vulnerability detectors by exploiting their exposed reasoning chains
- Multi-stage evolving obfuscation strategy chains that are hard to reverse and autonomously constructed without expert effort
- Demonstrated transferable evasion across 14/15 vulnerability categories and multiple SOTA LLMs, far exceeding prior methods (2/15)
🛡️ Threat Analysis
CoTDeceptor crafts adversarial obfuscated code inputs that cause LLM-based vulnerability detectors to misclassify (fail to detect) malicious code at inference time — a classic evasion/input-manipulation attack achieving 14/15 vulnerability category bypasses.
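The category-level figure quoted above (14/15 versus 2/15) can be read as a count of vulnerability categories in which the detector was evaded. The summary does not state the exact counting rule, so the sketch below is only an illustration under an assumed rule: a category counts as bypassed when at least one of its obfuscated samples goes undetected. The function names, category labels, and sample verdicts are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def bypassed_categories(results):
    """Return the sorted list of categories with at least one undetected sample.

    `results` is a list of (category, detected) pairs, one per obfuscated
    sample. The "at least one evasion" rule is an assumption, not the
    paper's stated definition.
    """
    evaded = defaultdict(bool)
    for category, detected in results:
        # A sample evades the detector when it was NOT flagged.
        evaded[category] |= not detected
    return sorted(c for c, was_evaded in evaded.items() if was_evaded)


def bypass_ratio(results):
    """Return (number of bypassed categories, total number of categories)."""
    categories = {category for category, _ in results}
    return len(bypassed_categories(results)), len(categories)


# Hypothetical detector verdicts, for illustration only.
toy_results = [
    ("sql_injection", False),      # not detected
    ("sql_injection", True),       # detected
    ("command_injection", True),   # detected
    ("path_traversal", False),     # not detected
]

bypassed, total = bypass_ratio(toy_results)
print(f"{bypassed}/{total} categories bypassed")
```

A stricter rule, such as requiring a majority of samples in a category to evade detection, would give lower counts on the same verdicts, so reported ratios are only comparable when the counting rule is the same.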