Attack · 2025

EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion

Zhen Liang , Hai Huang , Zhengkui Chen

1 citation · 31 references · arXiv

Published on arXiv: 2512.23173

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

EquaCode achieves 91.19% average attack success rate on the GPT series and 86.98% across 10 LLMs with only a single query, outperforming either the equation or code module alone.

EquaCode

Novel technique introduced


Large language models (LLMs), such as ChatGPT, have achieved remarkable success across a wide range of fields. However, their trustworthiness remains a significant concern, as they are still susceptible to jailbreak attacks aimed at eliciting inappropriate or harmful responses. Existing jailbreak attacks mainly operate at the natural language level and rely on a single attack strategy, limiting their effectiveness in comprehensively assessing LLM robustness. In this paper, we propose EquaCode, a novel multi-strategy jailbreak approach for large language models via equation solving and code completion. This approach transforms malicious intent into a mathematical problem and then requires the LLM to solve it using code, leveraging the complexity of cross-domain tasks to divert the model's focus toward task completion rather than safety constraints. Experimental results show that EquaCode achieves an average success rate of 91.19% on the GPT series and 98.65% across 3 state-of-the-art LLMs, all with only a single query. Further, ablation experiments demonstrate that EquaCode outperforms either the mathematical equation module or the code module alone. This indicates a strong synergistic effect, showing that a multi-strategy approach yields results greater than the sum of its parts.
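The excerpt does not include the paper's actual prompt template. As a rough structural illustration of the two-stage idea described above (a task reframed as an equation, then a code-completion request), here is a hypothetical sketch; the function name `build_equacode_prompt` and all template wording are invented for illustration and use only a benign placeholder task:

```python
def build_equacode_prompt(task_description: str) -> str:
    """Hypothetical sketch of the EquaCode composition: stage 1 recasts a
    task as an equation-solving problem; stage 2 asks the model to complete
    code that produces the solution. Wording is illustrative, not the
    paper's actual template."""
    # Stage 1: frame the task as a mathematical problem to solve.
    equation_stage = (
        "Consider the following problem expressed as an equation:\n"
        f"Let x denote the procedure required to {task_description}.\n"
        "Solve for x, showing each step."
    )
    # Stage 2: request a code completion that implements the solution,
    # shifting the model's focus to cross-domain task completion.
    code_stage = (
        "Now complete this Python function so that it returns x:\n"
        "def solve():\n"
        "    # implement the solution steps here\n"
        "    ..."
    )
    return equation_stage + "\n\n" + code_stage


# Benign usage example: the same template around a harmless task.
prompt = build_equacode_prompt("sort a list of integers")
```

The sketch only shows how the two strategies are chained into a single query; the paper's reported success rates depend on its specific (unpublished here) template and target models.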


Key Contributions

  • Multi-strategy jailbreak combining equation-solving and code completion to exploit safety alignment gaps in non-natural-language domains
  • Single-query attack achieving 91.19% success rate on GPT series and 86.98% across 10 LLMs
  • Ablation study demonstrating strong synergistic effect between the math and code modules beyond either strategy alone

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_box, inference_time, targeted
Datasets
GPT series (GPT-3.5, GPT-4); 10 state-of-the-art LLMs (unspecified in excerpt)
Applications
general-purpose llm chatbots; code-capable llms