attack 2026

Transient Turn Injection: Exposing Stateless Multi-Turn Vulnerabilities in Large Language Models

Naheed Rayhan ¹, Sohely Jahan ²

¹ Jagannath University

² University of Barishal

0 citations

Published on arXiv

2604.21860

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Discovers previously unknown model-specific vulnerabilities across state-of-the-art LLMs, with only select architectures showing substantial inherent robustness to multi-turn stateless jailbreak attacks

Transient Turn Injection (TTI)

Novel technique introduced

Large language models (LLMs) are increasingly integrated into sensitive workflows, raising the stakes for adversarial robustness and safety. This paper introduces Transient Turn Injection (TTI), a new multi-turn attack technique that systematically exploits stateless moderation by distributing adversarial intent across isolated interactions. TTI uses automated attacker agents powered by large language models to iteratively test and evade policy enforcement in both commercial and open-source LLMs, marking a departure from conventional jailbreak approaches that typically depend on maintaining persistent conversational context. Our extensive evaluation across state-of-the-art models, including those from OpenAI, Anthropic, Google Gemini, Meta, and prominent open-source alternatives, uncovers significant variation in resilience to TTI attacks, with only select architectures exhibiting substantial inherent robustness. Our automated black-box evaluation framework also uncovers previously unknown model-specific vulnerabilities and attack surface patterns, especially within medical and high-stakes domains. We further compare TTI against established adversarial prompting methods and detail practical mitigation strategies, such as session-level context aggregation and deep alignment approaches. Our study underscores the urgent need for holistic, context-aware defenses and continuous adversarial testing to future-proof LLM deployments against evolving multi-turn threats.
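The session-level context aggregation mitigation named in the abstract can be illustrated with a minimal defensive sketch. It is not the paper's implementation: the integer risk weights, the placeholder terms ("alpha", "beta", "gamma"), the threshold, and the `client_id` key are all hypothetical, and a real deployment would use a trained classifier rather than keyword matching. The sketch only contrasts judging each turn in isolation with accumulating risk across turns from the same client.

```python
from collections import defaultdict

# Hypothetical risk weights for placeholder terms (assumption, not from the paper).
RISK_WEIGHTS = {"alpha": 2, "beta": 2, "gamma": 2}
THRESHOLD = 5  # hypothetical cutoff at which a request is flagged


def turn_risk(message: str) -> int:
    """Toy risk score: sum of the weights of flagged terms found in one message."""
    words = set(message.lower().split())
    return sum(weight for term, weight in RISK_WEIGHTS.items() if term in words)


def stateless_flags(messages: list[str]) -> list[bool]:
    """Stateless moderation: every turn is scored on its own, with no memory."""
    return [turn_risk(m) >= THRESHOLD for m in messages]


class SessionModerator:
    """Session-level aggregation: risk accumulates per client across turns."""

    def __init__(self, threshold: int = THRESHOLD) -> None:
        self.threshold = threshold
        self.totals: dict[str, int] = defaultdict(int)

    def check(self, client_id: str, message: str) -> bool:
        # Add this turn's score to the client's running total, then compare.
        self.totals[client_id] += turn_risk(message)
        return self.totals[client_id] >= self.threshold


turns = ["tell me about alpha", "now explain beta", "finally describe gamma"]

print(stateless_flags(turns))  # [False, False, False]

moderator = SessionModerator()
print([moderator.check("client-1", t) for t in turns])  # [False, False, True]
```

In this toy setup each turn stays below the threshold on its own, so the stateless check never fires, while the aggregated total crosses the threshold on the third turn. Keying the totals by client rather than by conversation is a design assumption here, chosen because the abstract describes the adversarial intent as spread across isolated interactions.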

Key Contributions

Novel Transient Turn Injection (TTI) attack technique exploiting stateless moderation in multi-turn LLM conversations
Automated black-box evaluation framework using LLM-powered attacker agents to systematically test multi-turn jailbreak resilience
Comprehensive evaluation across commercial (OpenAI, Anthropic, Google Gemini, Meta) and open-source LLMs revealing significant variation in robustness to multi-turn attacks

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

black_box, inference_time, targeted

Applications

conversational ai, chatbots, llm assistants

Similar Papers

Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference

Emoji-Based Jailbreaking of Large Language Models

Can You Trick the Grader? Adversarial Persuasion of LLM Judges

In-Context Representation Hijacking

Quant Fever, Reasoning Blackholes, Schrodinger's Compliance, and More: Probing GPT-OSS-20B

Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges

AIP: Subverting Retrieval-Augmented Generation via Adversarial Instructional Prompt

Structured Semantic Cloaking for Jailbreak Attacks on Large Language Models