SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents
Zonghao Ying 1,2, Yangguang Shao 1,2, Jianle Gan 1,3, Gan Xu 2, Junjie Shen 4,2, Wenxin Zhang 5, Quanchen Zou 2, Junzheng Shi 6,2, Zhenfei Yin 2,7, Mingchuan Zhang 8, Aishan Liu 1, Xianglong Liu 1,9
3 China University of Petroleum (East China)
4 Zhejiang University of Technology
5 University of Chinese Academy of Sciences
Published on arXiv (2510.10073)
Prompt Injection
OWASP LLM Top 10 — LLM01
Excessive Agency
OWASP LLM Top 10 — LLM08
Key Finding
All 9 tested LVLMs (general-purpose, agent-specialized, and GUI-grounded) are consistently vulnerable to subtle adversarial manipulations, revealing critical trade-offs between model specialization and security robustness
SecureWebArena
Novel technique introduced
Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in real-world environments, they face serious security risks, motivating the design of security evaluation benchmarks. Existing benchmarks provide only partial coverage, typically restricted to narrow scenarios such as user-level prompt manipulation, and thus fail to capture the broad range of agent vulnerabilities. To address this gap, we present SecureWebArena, the first holistic benchmark for evaluating the security of LVLM-based web agents. SecureWebArena first introduces a unified evaluation suite comprising six simulated but realistic web environments (e.g., e-commerce platforms, community forums) and includes 2,970 high-quality trajectories spanning diverse tasks and attack settings. The suite defines a structured taxonomy of six attack vectors spanning both user-level and environment-level manipulations. In addition, we introduce a multi-layered evaluation protocol that analyzes agent failures across three critical dimensions: internal reasoning, behavioral trajectory, and task outcome, facilitating a fine-grained risk analysis that goes far beyond simple success metrics. Using this benchmark, we conduct large-scale experiments on 9 representative LVLMs, which fall into three categories: general-purpose, agent-specialized, and GUI-grounded. Our results show that all tested agents are consistently vulnerable to subtle adversarial manipulations and reveal critical trade-offs between model specialization and security. By providing (1) a comprehensive benchmark suite with diverse environments and a multi-layered evaluation pipeline, and (2) empirical insights into the security challenges of modern LVLM-based web agents, SecureWebArena establishes a foundation for advancing trustworthy web agent deployment.
Key Contributions
- First holistic security benchmark for LVLM-based web agents with six simulated realistic environments and 2,970 annotated trajectories spanning diverse tasks and attack settings
- Structured taxonomy of six attack vectors covering both user-level and environment-level manipulations (including prompt injection and pop-up attacks)
- Multi-layered evaluation protocol analyzing agent failures across internal reasoning, behavioral trajectory, and task outcome dimensions — enabling fine-grained risk analysis beyond binary success metrics
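The multi-layered protocol above can be sketched as a simple data structure. This is a hypothetical illustration, not the paper's implementation: the class name, field names, and the `risk_profile` helper are assumptions; the point is only that each of the three dimensions (internal reasoning, behavioral trajectory, task outcome) is scored separately rather than collapsed into one binary success flag.

```python
from dataclasses import dataclass
from enum import Enum


class AttackLevel(Enum):
    """The two manipulation levels in the taxonomy (the six specific
    attack-vector names are not reproduced here)."""
    USER = "user-level"                 # e.g., prompt injection in the query
    ENVIRONMENT = "environment-level"   # e.g., pop-up attacks on the page


@dataclass
class TrajectoryEvaluation:
    """One evaluation record, with each dimension scored independently."""
    reasoning_compromised: bool  # did adversarial content steer internal reasoning?
    trajectory_deviated: bool    # did the agent take unintended actions?
    outcome_failed: bool         # did the final task outcome fail or cause harm?

    def risk_profile(self) -> str:
        """Summarize which dimensions were affected, e.g. 'reasoning+trajectory'."""
        dims = [
            ("reasoning", self.reasoning_compromised),
            ("trajectory", self.trajectory_deviated),
            ("outcome", self.outcome_failed),
        ]
        affected = [name for name, hit in dims if hit]
        return "secure" if not affected else "+".join(affected)


# A reasoning-only compromise would be invisible to an outcome-only metric:
print(TrajectoryEvaluation(True, False, False).risk_profile())  # reasoning
```

The design choice this sketch highlights is why the paper's fine-grained analysis matters: an attack that poisons the agent's reasoning but happens not to change the final outcome still counts as a failure under this protocol, whereas a binary task-success metric would report it as safe.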