attack 2025

ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking

Yunzhe Li ¹, Jianan Wang ¹, Hongzi Zhu ¹, James Lin ¹, Shan Chang ², Minyi Guo ¹

¹ Shanghai Jiao Tong University

² Donghua University

7 citations · 1 influential · 43 references · arXiv

Published on arXiv

2512.07086

Model Denial of Service

OWASP LLM Top 10 — LLM04

Key Finding

Under strict 10 RPM rate limits, ThinkTrap degrades commercial LLM service throughput to as low as 1% of original capacity and can induce complete service failure.

ThinkTrap

Novel technique introduced

Large Language Models (LLMs) have become foundational components in a wide range of applications, including natural language understanding and generation, embodied intelligence, and scientific discovery. As their computational requirements continue to grow, these models are increasingly deployed as cloud-based services, allowing users to access powerful LLMs via the Internet. However, this deployment model introduces a new class of threat: denial-of-service (DoS) attacks via unbounded reasoning, where adversaries craft inputs that cause the model to enter excessively long or infinite generation loops. These attacks can exhaust backend compute resources, degrading or denying service to legitimate users. To mitigate such risks, many LLM providers adopt a closed-source, black-box setting to obscure model internals. In this paper, we propose ThinkTrap, a novel input-space optimization framework for DoS attacks against LLM services that works even in black-box environments. The core idea of ThinkTrap is to first map discrete tokens into a continuous embedding space, then perform efficient black-box optimization in a low-dimensional subspace that exploits input sparsity. The goal of this optimization is to identify adversarial prompts that induce extended or non-terminating generation across several state-of-the-art LLMs, achieving DoS with minimal token overhead. We evaluate the proposed attack across multiple commercial, closed-source LLM services. Our results demonstrate that, even when operating well within the restrictive request-rate limits commonly enforced by these platforms, typically ten requests per minute (10 RPM), the attack can degrade service throughput to as low as 1% of its original capacity and, in some cases, induce complete service failure.

Key Contributions

ThinkTrap: a black-box input-space optimization framework that maps discrete tokens into a continuous embedding space and optimizes adversarial prompts in a low-dimensional subspace exploiting input sparsity
Demonstrates that DoS attacks via unbounded reasoning are feasible against closed-source commercial LLM services without internal model access (no logits, no gradients)
Empirically shows throughput degradation to as low as 1% of original capacity, and complete service failure in some cases, even under strict 10 RPM rate limits
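
The two-step idea named in the first contribution (continuous embedding, then derivative-free search in a low-dimensional subspace) can be illustrated with a toy sketch. This is not the authors' implementation: the embedding table, the random projection, the nearest-token decoding, the simple random-search optimizer, and the `toy_score` objective are all assumptions chosen for illustration, and the objective is a synthetic stand-in that never queries any model or service.

```python
import math
import random


def make_embeddings(vocab_size, dim, rng):
    # Hypothetical embedding table: one random vector per token id.
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(vocab_size)]


def make_projection(low_dim, high_dim, rng):
    # Fixed random matrix lifting a low-dimensional point to the full space.
    scale = 1.0 / math.sqrt(low_dim)
    return [[rng.gauss(0, 1) * scale for _ in range(high_dim)]
            for _ in range(low_dim)]


def lift(z, projection):
    # x = z @ P, with x living in the (prompt_len * dim)-dimensional space.
    high_dim = len(projection[0])
    return [sum(z[i] * projection[i][j] for i in range(len(z)))
            for j in range(high_dim)]


def nearest_token(vec, embeddings):
    # Map a continuous vector back to the closest discrete token id.
    return min(range(len(embeddings)),
               key=lambda t: sum((a - b) ** 2
                                 for a, b in zip(vec, embeddings[t])))


def decode(z, projection, embeddings, prompt_len):
    dim = len(embeddings[0])
    x = lift(z, projection)
    return [nearest_token(x[k * dim:(k + 1) * dim], embeddings)
            for k in range(prompt_len)]


def toy_score(tokens):
    # Synthetic objective (assumption): counts even token ids.
    # It only stands in for "some scalar observed from a black box".
    return sum(1 for t in tokens if t % 2 == 0)


def random_search(score_fn, projection, embeddings, prompt_len,
                  low_dim, iters, sigma, rng):
    # Derivative-free search: perturb z, keep the candidate if not worse.
    z = [0.0] * low_dim
    best_tokens = decode(z, projection, embeddings, prompt_len)
    best = score_fn(best_tokens)
    for _ in range(iters):
        cand = [zi + rng.gauss(0, sigma) for zi in z]
        tokens = decode(cand, projection, embeddings, prompt_len)
        s = score_fn(tokens)
        if s >= best:
            z, best, best_tokens = cand, s, tokens
    return best_tokens, best


rng = random.Random(0)
embeddings = make_embeddings(vocab_size=50, dim=4, rng=rng)
projection = make_projection(low_dim=6, high_dim=8 * 4, rng=rng)
tokens, score = random_search(toy_score, projection, embeddings,
                              prompt_len=8, low_dim=6, iters=200,
                              sigma=0.5, rng=rng)
print(tokens, score)
```

The search only ever moves 6 numbers rather than 32, which is the point of working in a subspace: each evaluation of a black-box objective is expensive, so fewer free parameters means fewer queries. Only scalar scores are used, matching the "no logits, no gradients" setting described above.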

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

black_box, inference_time

Applications

llm cloud services, commercial llm apis

Similar Papers

POT: Inducing Overthinking in LLMs via Black-Box Iterative Optimization

LoopLLM: Transferable Energy-Latency Attacks in LLMs via Repetitive Generation

RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress

Rethinking Latency Denial-of-Service: Attacking the LLM Serving Framework, Not the Model

Sponge Tool Attack: Stealthy Denial-of-Efficiency against Tool-Augmented Agentic Reasoning

SHIELD: An Auto-Healing Agentic Defense Framework for LLM Resource Exhaustion Attacks

ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models

Overthinking Loops in Agents: A Structural Risk via MCP Tools