TASO: Jailbreak LLMs via Alternative Template and Suffix Optimization
Yanting Wang, Runpeng Geng, Jinghui Chen, Minhao Cheng, Jinyuan Jia
Published on arXiv: 2511.18581
Input Manipulation Attack
OWASP ML Top 10 — ML01
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
TASO successfully jailbreaks 24 leading LLMs, including models from the Llama family, OpenAI, and DeepSeek, and outperforms prior jailbreak techniques that optimize only a template or only a suffix.
TASO
Novel technique introduced
Many recent studies have shown that LLMs are vulnerable to jailbreak attacks, where an attacker perturbs the input of an LLM to induce it to generate an output for a harmful question. In general, existing jailbreak techniques either optimize a semantic template intended to induce the LLM to produce harmful outputs or optimize a suffix that leads the LLM to begin its response with specific tokens (e.g., "Sure"). In this work, we introduce TASO (Template and Suffix Optimization), a novel jailbreak method that optimizes both a template and a suffix in an alternating manner. Our insight is that suffix optimization and template optimization are complementary: suffix optimization can effectively control the first few output tokens but cannot control the overall quality of the output, while template optimization provides guidance for the entire output but cannot effectively control the initial tokens, which significantly influence the rest of the response. Thus, the two can be combined to improve the attack's effectiveness. We evaluate TASO on benchmark datasets (including HarmBench and AdvBench) across 24 leading LLMs (including models from the Llama family, OpenAI, and DeepSeek). The results demonstrate that TASO can effectively jailbreak existing LLMs. We hope our work can inspire future studies in this direction.
Key Contributions
- Proposes TASO, a jailbreak method that alternately optimizes a semantic jailbreak template and an adversarial suffix, exploiting their complementary strengths
- Demonstrates that suffix optimization controls initial output tokens while template optimization guides overall output quality, and that combining them yields stronger attacks
- Evaluates TASO on 24 leading LLMs (including models from the Llama family, OpenAI, and DeepSeek) across HarmBench and AdvBench, reporting that it outperforms existing jailbreak methods
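The alternating scheme named in the contributions above is an instance of a general pattern: hold one block of variables fixed, improve the other, then swap. The sketch below shows only that generic pattern on a toy numeric objective. It is not the TASO procedure and involves no language model; the objective, the variable names `x` and `y`, and the helper functions are all hypothetical, chosen so that each block update has a closed form.

```python
# Toy illustration of alternating (block-coordinate) optimization.
# Assumption: this is a generic numeric example, NOT the TASO attack.
# x and y are two abstract blocks that are updated in turn.

def objective(x: float, y: float) -> float:
    # Coupled quadratic: neither block can be solved in isolation.
    return (x - 2.0) ** 2 + (y + 1.0) ** 2 + 0.5 * x * y


def best_x_given_y(y: float) -> float:
    # Set d/dx = 2(x - 2) + 0.5*y to zero and solve for x.
    return 2.0 - 0.25 * y


def best_y_given_x(x: float) -> float:
    # Set d/dy = 2(y + 1) + 0.5*x to zero and solve for y.
    return -1.0 - 0.25 * x


def alternate(x: float = 0.0, y: float = 0.0, rounds: int = 20):
    """Alternate exact updates of x and y, recording the objective each round."""
    history = [objective(x, y)]
    for _ in range(rounds):
        x = best_x_given_y(y)  # update block 1 with block 2 held fixed
        y = best_y_given_x(x)  # update block 2 with block 1 held fixed
        history.append(objective(x, y))
    return x, y, history


if __name__ == "__main__":
    x, y, history = alternate()
    print(f"x={x:.4f}, y={y:.4f}, objective={history[-1]:.4f}")
```

Because each update exactly minimizes the objective over one block, the recorded objective never increases from one round to the next. The point of the toy is that coupled blocks benefit from being revisited in turn rather than optimized once in isolation.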
🛡️ Threat Analysis
The suffix optimization component uses gradient-based, token-level perturbations (an adversarial suffix in the style of GCG) to force specific initial output tokens. This is adversarial suffix optimization against LLMs, which qualifies it as ML01 (Input Manipulation Attack) per the guidelines.