Defense · 2025

FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents

Imene Kerboua 1,2, Sahar Omidi Shayegan 3,4, Megh Thakkar 3, Xing Han Lù 3,4, Léo Boisvert 3,5, Massimo Caccia 3, Jérémy Espinas 2, Alexandre Aussem 1, Véronique Eglin 1, Alexandre Lacoste 3

2 citations · 29 references · arXiv


Published on arXiv · 2510.03204

Prompt Injection

OWASP LLM Top 10: LLM01

Key Finding

FocusAgent reduces web page observation size by over 50% while matching baseline task performance and significantly reducing success rates of banner and pop-up prompt injection attacks.

FocusAgent

Novel technique introduced


Web agents powered by large language models (LLMs) must process lengthy web page observations to complete user goals; these pages often exceed tens of thousands of tokens. This saturates context limits and increases computational cost; moreover, processing full pages exposes agents to security risks such as prompt injection. Existing pruning strategies either discard relevant content or retain irrelevant context, leading to suboptimal action prediction. We introduce FocusAgent, a simple yet effective approach that leverages a lightweight LLM retriever to extract the most relevant lines from accessibility tree (AxTree) observations, guided by task goals. By pruning noisy and irrelevant content, FocusAgent enables efficient reasoning while reducing vulnerability to injection attacks. Experiments on the WorkArena and WebArena benchmarks show that FocusAgent matches the performance of strong baselines while reducing observation size by over 50%. Furthermore, a variant of FocusAgent significantly reduces the success rate of prompt-injection attacks, including banner and pop-up attacks, while maintaining task success in attack-free settings. Our results highlight that targeted LLM-based retrieval is a practical and robust strategy for building web agents that are efficient, effective, and secure.


Key Contributions

  • FocusAgent: a lightweight LLM retriever that extracts task-relevant lines from accessibility tree observations, reducing observation size by over 50% while matching strong baselines on WorkArena and WebArena
  • Demonstrates that targeted context pruning provides an emergent defense against indirect prompt injection attacks (banner and pop-up attacks) in web agent pipelines
  • A security-focused FocusAgent variant that significantly reduces prompt injection attack success rates while preserving task performance in attack-free settings
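The retrieval-based pruning described above can be sketched as follows. The paper's retriever is a lightweight LLM that selects task-relevant lines from the AxTree; in this hypothetical sketch, a simple keyword-overlap scorer stands in for the LLM so the example runs without model access. The function names (`score_line`, `focus_prune`) and the `keep_ratio` parameter are illustrative assumptions, not the paper's API.

```python
# Hypothetical sketch of FocusAgent-style observation pruning.
# A keyword-overlap scorer stands in for the paper's lightweight LLM retriever.

def score_line(line: str, goal: str) -> int:
    """Count goal keywords (longer than 3 chars) appearing in an AxTree line."""
    goal_words = {w.lower() for w in goal.split() if len(w) > 3}
    line_lower = line.lower()
    return sum(1 for w in goal_words if w in line_lower)

def focus_prune(axtree: str, goal: str, keep_ratio: float = 0.5) -> str:
    """Keep the highest-scoring lines (at most keep_ratio of the page),
    preserving original document order, analogous to keeping only the
    retriever-selected AxTree lines."""
    lines = axtree.splitlines()
    budget = max(1, int(len(lines) * keep_ratio))
    ranked = sorted(range(len(lines)),
                    key=lambda i: score_line(lines[i], goal),
                    reverse=True)
    kept = sorted(ranked[:budget])  # restore document order
    return "\n".join(lines[i] for i in kept)

if __name__ == "__main__":
    axtree = "\n".join([
        "banner 'Limited offer! Click here now'",
        "link 'Create incident'",
        "textbox 'Short description'",
        "button 'Submit'",
        "contentinfo 'Copyright 2025'",
        "link 'Unrelated promo'",
    ])
    goal = "Create an incident with a short description and submit it"
    print(focus_prune(axtree, goal, keep_ratio=0.5))
```

Note how the injected banner line ("Limited offer! Click here now") scores zero against the goal and is pruned, illustrating why goal-conditioned retrieval also acts as a defense against banner and pop-up injections.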


Details

Domains: nlp
Model Types: llm, transformer
Threat Tags: inference_time, black_box
Datasets: WorkArena, WebArena
Applications: web agents, llm-based web browsing automation