
CoTDeceptor: Adversarial Code Obfuscation Against CoT-Enhanced LLM Code Agents

Haoyang Li 1, Mingjin Li, Jinxin Zuo 2,3, Siqi Li 1, Xiao Li 4, Hao Wu 4, Yueming Lu 1, Xiaochuan He 5

0 citations · 39 references · arXiv


Published on arXiv · 2512.21250

Input Manipulation Attack

OWASP ML Top 10 — ML01

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

CoTDeceptor bypasses 14 out of 15 vulnerability categories against state-of-the-art CoT-enhanced LLMs, compared to only 2 bypassed by prior obfuscation methods.

CoTDeceptor

Novel technique introduced


LLM-based code agents (e.g., ChatGPT Codex) are increasingly deployed as detectors for code review and security auditing tasks. Although CoT-enhanced LLM vulnerability detectors are believed to offer improved robustness against obfuscated malicious code, we find that their reasoning chains and semantic abstraction processes exhibit exploitable, systematic weaknesses. This allows attackers to covertly embed malicious logic, bypass code review, and propagate backdoored components through real-world software supply chains. To investigate this issue, we present CoTDeceptor, the first adversarial code obfuscation framework targeting CoT-enhanced LLM detectors. CoTDeceptor autonomously constructs evolving, hard-to-reverse multi-stage obfuscation strategy chains that disrupt CoT-driven detection logic. Evaluated on malicious code samples provided by a security enterprise, CoTDeceptor achieves stable and transferable evasion against state-of-the-art LLMs and vulnerability detection agents, bypassing 14 out of 15 vulnerability categories, compared to only 2 bypassed by prior methods. Our findings highlight potential risks in real-world software supply chains and underscore the need for more robust and interpretable LLM-powered security analysis systems.
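To make the idea of a multi-stage obfuscation strategy chain concrete, here is a minimal, purely illustrative sketch. The stage functions (`rename_identifiers`, `encode_strings`, `split_control_flow`) and their composition are our own assumptions for exposition, not CoTDeceptor's actual implementation; the paper's chains are constructed autonomously and are far more sophisticated. The sketch only shows the general pattern: each stage removes one surface cue a reasoning-based detector might latch onto, and the composed result is harder to reverse than any single stage.

```python
import base64

# Hypothetical stages of an obfuscation "strategy chain" (illustrative only).

def rename_identifiers(code: str) -> str:
    """Stage 1: replace a telltale identifier with an innocuous one."""
    return code.replace("exfiltrate", "sync_metrics")

def encode_strings(code: str) -> str:
    """Stage 2: hide a suspicious string literal behind base64 decoding."""
    url = "http://evil.example/upload"
    encoded = base64.b64encode(url.encode()).decode()
    return code.replace(
        f'"{url}"',
        f'__import__("base64").b64decode("{encoded}").decode()',
    )

def split_control_flow(code: str) -> str:
    """Stage 3: route the call through a dynamically assembled name."""
    return code.replace(
        "sync_metrics(data)",
        'globals()["sync" + "_metrics"](data)',
    )

def obfuscate(code: str, chain) -> str:
    """Apply each stage in order, composing a multi-stage chain."""
    for stage in chain:
        code = stage(code)
    return code

malicious = 'exfiltrate(data)\nurl = "http://evil.example/upload"'
obfuscated = obfuscate(
    malicious, [rename_identifiers, encode_strings, split_control_flow]
)
print(obfuscated)
```

After the chain runs, neither the identifier `exfiltrate` nor the plaintext URL appears in the output, so a detector reasoning over surface features has less to anchor on; CoTDeceptor's contribution is discovering such chains automatically against the detector's exposed reasoning.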


Key Contributions

  • CoTDeceptor: the first adversarial code obfuscation framework specifically targeting CoT-enhanced LLM vulnerability detectors by exploiting their exposed reasoning chains
  • Multi-stage evolving obfuscation strategy chains that are hard to reverse and autonomously constructed without expert effort
  • Demonstrated transferable evasion across 14/15 vulnerability categories and multiple SOTA LLMs, far exceeding prior methods (2/15)

🛡️ Threat Analysis

Input Manipulation Attack

CoTDeceptor crafts adversarial obfuscated code inputs that cause LLM-based vulnerability detectors to misclassify (fail to detect) malicious code at inference time — a classic evasion/input-manipulation attack achieving 14/15 vulnerability category bypasses.


Details

Domains
nlp
Model Types
llm · transformer
Threat Tags
black_box · inference_time · targeted · digital
Applications
code vulnerability detection · llm-based code review · security auditing · ci pipeline security