Attack · 2025

POT: Inducing Overthinking in LLMs via Black-Box Iterative Optimization

Xinyu Li 1, Tianjin Huang 1, Ronghui Mu 1, Xiaowei Huang 2, Gaojie Jin 1

0 citations


Published on arXiv (arXiv:2508.19277)

Model Denial of Service

OWASP LLM Top 10 — LLM04

Key Finding

POT inflates reasoning tokens more effectively than retrieval-dependent baseline methods while maintaining semantic naturalness and evading detection across diverse LLM architectures.
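Token inflation is commonly quantified as the ratio of reasoning tokens consumed under the adversarial prompt to those consumed on the clean prompt. The paper's exact metric is not reproduced on this card, so the function and numbers below are an illustrative assumption:

```python
def token_inflation(baseline_tokens: int, attacked_tokens: int) -> float:
    """Ratio of reasoning tokens under attack to the clean baseline.

    A value of 1.0 means no inflation; larger values mean the attack
    forced the model to "overthink". (Hypothetical metric; the paper
    may normalize differently.)
    """
    return attacked_tokens / baseline_tokens

# Made-up example: a clean CoT of 180 tokens balloons to 2160 tokens.
print(token_inflation(180, 2160))  # 12.0
```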

POT (Prompt-Only OverThinking)

Novel technique introduced


Recent advances in Chain-of-Thought (CoT) prompting have substantially enhanced the reasoning capabilities of large language models (LLMs), enabling sophisticated problem-solving through explicit multi-step reasoning traces. However, these enhanced reasoning processes introduce novel attack surfaces, particularly vulnerabilities to computational inefficiency through unnecessarily verbose reasoning chains that consume excessive resources without corresponding performance gains. Prior overthinking attacks typically require restrictive conditions including access to external knowledge sources for data poisoning, reliance on retrievable poisoned content, and structurally obvious templates that limit practical applicability in real-world scenarios. To address these limitations, we propose POT (Prompt-Only OverThinking), a novel black-box attack framework that employs LLM-based iterative optimization to generate covert and semantically natural adversarial prompts, eliminating dependence on external data access and model retrieval. Extensive experiments across diverse model architectures and datasets demonstrate that POT achieves superior performance compared to other methods.


Key Contributions

  • POT: a prompt-only black-box attack that iteratively optimizes semantically natural adversarial prompts to maximize reasoning token inflation without access to external knowledge bases or model internals
  • LLM-based iterative optimization with diversity-aware filtering to refine adversarial prompt elements, achieving higher stealth and cross-model transferability than prior retrieval-dependent overthinking attacks
  • Empirical evaluation across local (DeepSeek-R1) and commercial (GPT-4o) models demonstrating superior token inflation, prompt naturalness, and detection evasion compared to existing baselines
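The contributions above describe a mutate-score-filter loop: an attacker LLM rewrites the prompt, the black-box target is queried only for its output length, and a diversity filter avoids re-exploring candidates. The sketch below is a hypothetical illustration of that structure, not the paper's implementation; `query_target` and `mutate` are mocks standing in for the target-model API and the LLM rewriter:

```python
import random


def query_target(prompt: str) -> int:
    """Mock black-box target: returns a reasoning-token count.

    A real attacker would call the model API and count the tokens in
    the returned chain of thought; this mock just rewards longer,
    more 'reflective' prompts.
    """
    return 50 + 10 * prompt.count("consider") + len(prompt) // 8


def mutate(prompt: str, rng: random.Random) -> str:
    """Mock LLM-based rewriter: appends a natural-sounding nudge.

    The real attack would ask an LLM to rewrite the prompt so the
    perturbation stays covert and semantically natural.
    """
    nudges = [
        " Please consider every edge case before answering.",
        " Carefully consider alternative interpretations first.",
        " Double-check each intermediate step.",
    ]
    return prompt + rng.choice(nudges)


def diversity_filter(cands, seen, k):
    """Keep up to k candidates not already explored -- a crude proxy
    for the paper's diversity-aware filtering."""
    fresh = [c for c in cands if c not in seen]
    return fresh[:k]


def pot_optimize(seed_prompt: str, iters: int = 5, beam: int = 3) -> str:
    """Prompt-only black-box loop: propose, filter, query, keep best."""
    rng = random.Random(0)
    best, best_cost = seed_prompt, query_target(seed_prompt)
    seen = {seed_prompt}
    for _ in range(iters):
        cands = [mutate(best, rng) for _ in range(beam * 2)]
        for cand in diversity_filter(cands, seen, beam):
            seen.add(cand)
            cost = query_target(cand)   # black-box query only
            if cost > best_cost:        # maximize reasoning tokens
                best, best_cost = cand, cost
    return best


adv = pot_optimize("What is 17 * 24?")
print(query_target(adv) > query_target("What is 17 * 24?"))  # True
```

The key design point the contributions emphasize is that the loop needs nothing but query access: no poisoned corpus, no retriever, no gradients or logits from the target.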

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
black_box, inference_time
Datasets
GSM8K, mathematical QA benchmarks
Applications
llm reasoning systems, chain-of-thought inference, api-based llm services