defense 2026

Atomicity for Agents: Exposing, Exploiting, and Mitigating TOCTOU Vulnerabilities in Browser-Use Agents

Linxi Jiang , Zhijie Liu , Haotian Luo , Zhiqiang Lin

0 citations

Published on arXiv

2603.00476

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

TOCTOU vulnerabilities are widespread across all 10 evaluated browser-use agents, and pre-execution DOM/layout validation significantly reduces unintended action execution caused by adversarial or dynamic page changes

Pre-execution DOM validation

Novel technique introduced


Browser-use agents are widely used for everyday tasks. They automate interaction with web pages either through structured DOM-based interfaces or through vision-language models operating on page screenshots. However, web pages often change between planning and execution, so agents may act on stale assumptions about the page. We view this temporal mismatch as a time-of-check to time-of-use (TOCTOU) vulnerability in browser-use agents: dynamic or adversarial web content can exploit the window to induce unintended actions. We present a large-scale empirical study of TOCTOU vulnerabilities in browser-use agents, using a benchmark that spans synthesized and real-world websites. With this benchmark, we evaluate 10 popular open-source agents and show that TOCTOU vulnerabilities are widespread. We also design a lightweight mitigation based on pre-execution validation: it monitors DOM and layout changes during planning and validates the page state immediately before an action is executed. This approach reduces the risk of insecure execution and mitigates unintended side effects in browser-use agents.
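To make the planning-to-execution gap concrete, the following is a minimal toy sketch, not code from the paper. It assumes a hypothetical `Element` model in place of a real DOM and a hypothetical overlay injected while the agent is still planning. It shows how click coordinates computed at check time can land on a different element at use time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """A clickable element on a toy page (hypothetical model, not a real DOM)."""
    label: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def element_at(page: List[Element], px: int, py: int) -> Optional[str]:
    """Return the label of the topmost element under a point.

    Later entries in `page` are treated as drawn on top of earlier ones.
    """
    for el in reversed(page):
        if el.contains(px, py):
            return el.label
    return None


# Time of check: the agent observes the page and plans a click on "Cancel".
page = [Element("Cancel", 100, 200, 80, 30)]
target = page[0]
planned_click = (target.x + target.width // 2, target.y + target.height // 2)
checked_label = element_at(page, *planned_click)

# The page changes during planning (assumed here: a script injects an overlay).
page.append(Element("Confirm purchase", 90, 190, 120, 50))

# Time of use: the stale coordinates now hit a different element.
used_label = element_at(page, *planned_click)
print(checked_label, "->", used_label)
```

The agent's plan was valid for the page it observed, but nothing ties the action to that observation, which is the property a TOCTOU exploit relies on.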


Key Contributions

  • First systematic large-scale empirical study of TOCTOU vulnerabilities in browser-use agents, evaluating 10 popular open-source agents on synthesized and real-world websites
  • Demonstration that adversarial and dynamic web content can exploit the planning-execution timing gap to induce unintended agent actions at scale
  • Lightweight pre-execution validation mitigation that monitors DOM and layout changes between planning and action execution to prevent stale-state exploitation
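The pre-execution validation idea in the last contribution can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: `ElementState`, `fingerprint`, and `execute_if_unchanged` are hypothetical names, and the snapshot is assumed to consist of the target element's serialized markup plus its bounding box.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ElementState:
    """Hypothetical snapshot of a target element: markup plus bounding box."""
    outer_html: str
    box: Tuple[int, int, int, int]  # x, y, width, height


def fingerprint(state: ElementState) -> str:
    """Hash markup and layout together so a change to either is detected."""
    payload = f"{state.outer_html}|{state.box}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def execute_if_unchanged(planned: ElementState,
                         current: ElementState,
                         action: Callable[[], None]) -> bool:
    """Run `action` only if the element still matches its planning-time state."""
    if fingerprint(planned) != fingerprint(current):
        return False  # stale plan: the caller should re-observe and re-plan
    action()
    return True


# Usage: the same markup at a shifted position is treated as a changed target.
clicks = []
planned = ElementState('<button id="cancel">Cancel</button>', (100, 200, 80, 30))
same = ElementState('<button id="cancel">Cancel</button>', (100, 200, 80, 30))
moved = ElementState('<button id="cancel">Cancel</button>', (100, 400, 80, 30))

ok = execute_if_unchanged(planned, same, lambda: clicks.append("cancel"))
blocked = execute_if_unchanged(planned, moved, lambda: clicks.append("cancel"))
print(ok, blocked, clicks)
```

A check of this kind narrows the window rather than closing it: unless validation and action run as one atomic step in the browser, the page can still change between them.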

Details

Domains
nlp, vision, multimodal
Model Types
llm, vlm
Threat Tags
inference_time, digital
Datasets
synthesized web benchmark, real-world websites
Applications
browser automation, web agents, agentic ai systems