defense 2025

Throttling Web Agents Using Reasoning Gates

Abhinav Kumar , Jaechul Roh , Ali Naseh , Amir Houmansadr , Eugene Bagdasarian

Published on arXiv

2509.01619

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Reasoning gates achieve 9.2x computational asymmetry — response-generation cost for SOTA agent models is 9.2x higher than the server-side gate-generation cost.

Reasoning Gates

Novel technique introduced


AI web agents use Internet resources at far greater speed, scale, and complexity -- changing how users and services interact. Deployed maliciously or erroneously, these agents could overload content providers. At the same time, web agents can bypass CAPTCHAs and other defenses by mimicking user behavior or flood authentication systems with fake accounts. Yet providers must protect their services and content from denial-of-service attacks and scraping by web agents. In this paper, we design a framework that imposes tunable costs on agents before providing access to resources; we call this Web Agent Throttling. We start by formalizing Throttling Gates as challenges issued to an agent that are asymmetric, scalable, robust, and compatible with any agent. Focusing on a common component -- the language model -- we require the agent to solve reasoning puzzles, thereby incurring excessive token-generation costs. However, we find that using existing puzzles, e.g., coding or math, as throttling gates fails to satisfy our properties. To address this, we introduce rebus-based Reasoning Gates, synthetic text puzzles that require multi-hop reasoning over world knowledge (thereby throttling an agent's model). We design a scalable generation and verification protocol for such reasoning gates. Our framework achieves computational asymmetry, i.e., the response-generation cost is 9.2x higher than the generation cost for SOTA models. We further deploy reasoning gates on a custom website and Model Context Protocol (MCP) servers and evaluate with real-world web agents. Finally, we discuss the limitations and environmental impact of real-world deployment of our framework.
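The abstract mentions a generation and verification protocol but does not spell it out here. The following is only a minimal sketch of how a server might issue a gate and check an agent's reply, under stated assumptions: the names `Gate`, `issue_gate`, `verify`, and `normalize` are hypothetical, the puzzle text is a placeholder rather than a real rebus, and answer checking is assumed to be an exact match after whitespace and case normalization. The paper's actual protocol may differ.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass
class Gate:
    """Hypothetical record the server keeps for one issued challenge."""
    gate_id: str
    puzzle: str          # text shown to the agent
    answer_digest: str   # SHA-256 of the normalized expected answer


def normalize(answer: str) -> str:
    # Assumption: answers are compared case-insensitively, ignoring extra spaces.
    return " ".join(answer.lower().split())


def issue_gate(puzzle: str, answer: str) -> Gate:
    # The server already knows the answer because it generated the puzzle,
    # so issuing a gate is cheap compared with solving it.
    digest = hashlib.sha256(normalize(answer).encode("utf-8")).hexdigest()
    return Gate(gate_id=secrets.token_hex(8), puzzle=puzzle, answer_digest=digest)


def verify(gate: Gate, response: str) -> bool:
    # Verification is a single hash and a constant-time comparison.
    candidate = hashlib.sha256(normalize(response).encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, gate.answer_digest)


gate = issue_gate("<placeholder puzzle text>", "Eiffel Tower")
print(verify(gate, "  eiffel   tower "))  # correct answer, different spacing and case
print(verify(gate, "Big Ben"))            # wrong answer
```

The design point this illustrates is the asymmetry described above: the server's work is limited to constructing the puzzle and comparing digests, while the agent has to spend model tokens reasoning its way to the answer.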


Key Contributions

  • Formalization of the Web Agent Throttling problem with four properties: computational asymmetry, scalability, robustness, and agent-compatibility
  • Rebus-based Reasoning Gates — synthetic multi-hop reasoning puzzles that impose 9.2x higher token-generation cost on agents than the gate-generation cost for servers
  • End-to-end deployment on custom websites and MCP servers with real-world web agent evaluation
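As a small worked illustration of the computational-asymmetry property, the ratio can be computed from the cost on each side. This sketch assumes cost is measured in generated tokens; the function name `asymmetry_ratio` and the token counts are hypothetical values chosen only to reproduce a 9.2x ratio, not measurements from the paper.

```python
def asymmetry_ratio(agent_tokens: int, server_tokens: int) -> float:
    """Ratio of the agent's response-generation cost to the server's gate-generation cost."""
    if server_tokens <= 0:
        raise ValueError("server_tokens must be positive")
    return agent_tokens / server_tokens


# Illustrative numbers only: 920 tokens to solve vs. 100 tokens to generate.
print(asymmetry_ratio(920, 100))
```

A ratio above 1 means the gate costs the agent more than it costs the server, which is the condition the asymmetry property asks for.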

Details

Domains
nlp
Model Types
llm
Threat Tags
black_box, inference_time
Applications
web services, mcp servers, content providers, authentication systems