
ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models

Xiaogeng Liu 1,2, Xinyan Wang 3, Yechao Zhang 4, Sanjay Kariyappa 2, Chong Xiang 2, Muhao Chen 5, G. Edward Suh 2,6, Chaowei Xiao 1,2

0 citations · 53 references · arXiv

Published on arXiv

2602.00154

Model Denial of Service

OWASP LLM Top 10 — LLM04

Key Finding

ReasoningBomb induces 286.7x input-to-output token amplification on average, outperforms the best baseline by 38% in reasoning tokens, and evades detection with >98.4% bypass rate against dual-stage joint detection.

ReasoningBomb

Novel technique introduced


Large reasoning models (LRMs) extend large language models with explicit multi-step reasoning traces, but this capability introduces a new class of prompt-induced inference-time denial-of-service (PI-DoS) attacks that exploit the high computational cost of reasoning. We first formalize inference cost for LRMs and define PI-DoS, then argue that any practical PI-DoS attack must satisfy three properties: (i) a high amplification ratio, where each query induces a disproportionately long reasoning trace relative to its own length; (ii) stealthiness, in which prompts and responses remain on the natural-language manifold and evade distribution-shift detectors; and (iii) optimizability, in which the attack supports efficient optimization without being slowed by its own success. Under this framework, we present ReasoningBomb, a reinforcement-learning-based PI-DoS framework that is guided by a constant-time surrogate reward and trains a large-reasoning-model attacker to generate short natural prompts that drive victim LRMs into pathologically long and often effectively non-terminating reasoning. Across seven open-source models (including LLMs and LRMs) and three commercial LRMs, ReasoningBomb induces 18,759 completion tokens on average and 19,263 reasoning tokens on average across reasoning models. It outperforms the runner-up baseline by 35% in completion tokens and 38% in reasoning tokens, while inducing 6-7x more tokens than benign queries and achieving a 286.7x input-to-output amplification ratio averaged across all samples. Additionally, our method achieves a 99.8% bypass rate on input-based detection, 98.7% on output-based detection, and 98.4% against strict dual-stage joint detection.
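The amplification ratio from property (i) can be illustrated with a minimal sketch. The function and the token counts below are hypothetical stand-ins for illustration, not the paper's measurement code or data:

```python
# Hedged sketch: the input-to-output token amplification ratio described
# in the abstract. Token counts below are illustrative, not reproductions
# of the paper's measurements.

def amplification_ratio(prompt_tokens: int, completion_tokens: int) -> float:
    """Ratio of tokens the victim generates to tokens the attacker sends."""
    return completion_tokens / prompt_tokens

# e.g. a hypothetical 65-token adversarial prompt triggering the paper's
# average of 18,759 completion tokens
print(f"{amplification_ratio(65, 18_759):.1f}x")  # -> 288.6x
```

A short prompt that reliably triggers tens of thousands of completion tokens is what makes the attack an effective denial-of-service primitive: the defender pays orders of magnitude more compute per request than the attacker.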


Key Contributions

  • Formal characterization of PI-DoS attacks via three necessary properties: amplification ratio, stealthiness, and optimizability
  • ReasoningBomb: two-stage SFT + GRPO-based RL framework using a constant-time MLP surrogate reward (from victim hidden states) and a diversity reward to train a short-prompt attacker
  • Empirical demonstration across 10 victim models (7 open-source, 3 commercial) achieving 18,759 avg completion tokens, 286.7x amplification, and >98% bypass rate against input-, output-, and dual-stage detectors
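The constant-time surrogate reward in the second contribution can be sketched as a small frozen MLP probe over victim hidden states. Everything below is a hypothetical toy (shapes, weights, and the log-length target are assumptions, not the paper's trained probe); it only shows why the reward is O(1): the attacker queries a cheap forward pass instead of waiting for the victim's full reasoning trace:

```python
import numpy as np

# Toy sketch of a constant-time surrogate reward: a small frozen MLP that
# maps one victim hidden-state vector to a predicted reasoning length.
# In the paper's setting such a probe would be fit offline on
# (hidden state, observed reasoning length) pairs; here the weights are
# random placeholders.

rng = np.random.default_rng(0)
HIDDEN, PROBE = 64, 32  # victim hidden size and probe width (toy values)

W1 = rng.normal(0, 0.1, (HIDDEN, PROBE))
b1 = np.zeros(PROBE)
W2 = rng.normal(0, 0.1, (PROBE, 1))
b2 = np.zeros(1)

def surrogate_reward(hidden_state: np.ndarray) -> float:
    """O(1) forward pass predicting reasoning length from one hidden state."""
    h = np.maximum(hidden_state @ W1 + b1, 0.0)  # ReLU hidden layer
    return float(h @ W2 + b2)                    # scalar predicted length

h = rng.normal(size=HIDDEN)  # stand-in for a victim prompt hidden state
print(surrogate_reward(h))
```

This is what the abstract means by the attack "supports efficient optimization without being slowed by its own success": scoring a candidate prompt never requires generating the pathologically long trace it is designed to induce.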

🛡️ Threat Analysis


Details

Domains
nlp, reinforcement-learning
Model Types
llm, transformer, rl
Threat Tags
grey_box, black_box, inference_time
Applications
large reasoning model inference servers, llm api services