benchmark 2026

Indirect Prompt Injection in the Wild: An Empirical Study of Prevalence, Techniques, and Objectives

Soheil Khodayari ¹, Xuenan Zhang ², Bhupendra Acharya ³, Giancarlo Pellegrino ²

¹ Independent Researcher

² CISPA Helmholtz Center for Information Security

³ University of Louisiana

0 citations

Published on arXiv

2604.27202

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Identifies 15.3K real-world indirect prompt injections across 1.2B URLs, with 70% hidden in non-rendered HTML; compliance reaches up to 8% for smaller models on plain-text inputs

As LLMs are increasingly integrated into systems that browse, retrieve, summarize, and act on web content, webpages have become an untrusted input vector for downstream model behavior. This enables site owners, contributors, and adversaries to embed instructions directly in web resources, i.e., indirect prompt injections. While prior work demonstrates such attacks in controlled settings, their prevalence, deployment, and real-world impact remain unclear. We present one of the first large-scale empirical analyses of indirect prompt injections in webpages and HTTP responses. Analyzing 1.2B URLs from 24.8M hosts, we identify 15.3K validated instances across 11.7K pages. These are not isolated cases: a small number of recurring templates account for most cases. We characterize their objectives, delivery mechanisms, visibility, persistence, and impact, revealing a heterogeneous ecosystem spanning disruptive prompts, reputation manipulation, content-protection directives, and AI-bot detection, targeting systems such as crawlers, search pipelines, customer-support agents, and hiring workflows. A key finding is that most instructions target machines rather than humans: about 70% appear in non-rendered HTML (e.g., headers, comments, metadata), and many visible cases are hidden via rendering techniques. To assess practical risk, we run 5,200 controlled experiments across 13 models and four webpage representations. Our results show compliance is limited but non-negligible, reaching up to 8% for smaller models on plain-text inputs, while structured representations reduce compliance by preserving structural cues. Overall, prompt-based interference is already present in the web ecosystem and represents a growing source of tension between LLM-driven automation and the sites it consumes.

Key Contributions

First large-scale empirical measurement of indirect prompt injections in the wild, analyzing 1.2B URLs and identifying 15.3K validated instances across 11.7K pages
Characterization of attack delivery mechanisms (70% in non-rendered HTML like headers, comments, metadata), objectives (reputation manipulation, content protection, AI-bot detection), and target systems (crawlers, RAG, customer support agents)
Controlled evaluation of 5,200 experiments across 13 LLM models showing up to 8% compliance rates for smaller models on plain-text inputs, with structured representations reducing compliance

🛡️ Threat Analysis

Details

Domains

nlpmultimodal

Model Types

llm

Threat Tags

inference_timeblack_box

Datasets

1.2B URLs from 24.8M hosts15.3K validated indirect prompt injection instances

Applications

web crawlerssearch pipelinesrag systemscustomer support agentshiring workflowsllm-integrated automation

Read PDF arXiv

Indirect Prompt Injection in the Wild: An Empirical Study of Prevalence, Techniques, and Objectives

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

Red Teaming Multimodal Language Models: Evaluating Harm Across Prompt Modalities and Models

Alignment Drift in Multimodal LLMs: A Two-Phase, Longitudinal Evaluation of Harm Across Eight Model Releases

SafeMT: Multi-turn Safety for Multimodal Language Models

Lingua-SafetyBench: A Benchmark for Safety Evaluation of Multilingual Vision-Language Models

Countermind: A Multi-Layered Security Architecture for Large Language Models

AegisAgent: An Autonomous Defense Agent Against Prompt Injection Attacks in LLM-HARs

Can AI Models be Jailbroken to Phish Elderly Victims? An End-to-End Evaluation

Breaking Guardrails, Facing Walls: Insights on Adversarial AI for Defenders & Researchers