benchmark 2025

Exploiting Web Search Tools of AI Agents for Data Exfiltration

Dennis Rall 1, Bernhard Bauer 2, Mohit Mittal 1, Thomas Fraunholz 1

0 citations · 25 references · arXiv

α

Published on arXiv

2510.09093

Prompt Injection

OWASP LLM Top 10 — LLM01

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

Even well-known indirect prompt injection patterns continue to succeed against current LLMs integrated with web search tools, demonstrating persistent deficiencies in model defenses regardless of model size or manufacturer.


Large language models (LLMs) are now routinely used to autonomously execute complex tasks, from natural language processing to dynamic workflows like web searches. The usage of tool-calling and Retrieval Augmented Generation (RAG) allows LLMs to process and retrieve sensitive corporate data, amplifying both their functionality and vulnerability to abuse. As LLMs increasingly interact with external data sources, indirect prompt injection emerges as a critical and evolving attack vector, enabling adversaries to exploit models through manipulated inputs. Through a systematic evaluation of indirect prompt injection attacks across diverse models, we analyze how susceptible current LLMs are to such attacks, which parameters, including model size and manufacturer, specific implementations, shape their vulnerability, and which attack methods remain most effective. Our results reveal that even well-known attack patterns continue to succeed, exposing persistent weaknesses in model defenses. To address these vulnerabilities, we emphasize the need for strengthened training procedures to enhance inherent resilience, a centralized database of known attack vectors to enable proactive defense, and a unified testing framework to ensure continuous security validation. These steps are essential to push developers toward integrating security into the core design of LLMs, as our findings show that current models still fail to mitigate long-standing threats.


Key Contributions

  • Systematic evaluation of indirect prompt injection attack susceptibility across multiple LLM models, comparing model size and manufacturer-specific safeguards
  • Realistic attack scenario demonstrating data exfiltration via obfuscated indirect prompt injection in RAG-based web-search agents
  • Identification that well-documented attack patterns remain persistently effective and recommendations for centralized attack vector databases and unified testing frameworks

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_boxinference_timetargeted
Applications
llm agentsrag systemsenterprise knowledge basesweb search tool-calling