defense 2026

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

Hanna Foerster ¹, Tom Blanchard ^2,3, Kristina Nikolić ⁴, Ilia Shumailov ⁵, Cheng Zhang ⁵, Robert Mullins ¹, Ilia Shumailov ^2,3, Florian Tramèr ⁴, Yiren Zhao ⁵

¹ University of Cambridge

² University of Toronto

³ Vector Institute

⁴ ETH Zurich

⁵ AI Security Company

1 citations · 44 references · arXiv

Published on arXiv

2601.09923

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Single-Shot Planning retains up to 57% of frontier model performance while improving smaller open-source model performance by up to 19%, demonstrating security and utility can coexist in CUAs.

Single-Shot Planning

Novel technique introduced

AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss. The only known robust defense is architectural isolation that strictly separates trusted task planning from untrusted environment observations. However, applying this design to Computer Use Agents (CUAs) -- systems that automate tasks by viewing screens and executing actions -- presents a fundamental challenge: current agents require continuous observation of UI state to determine each action, conflicting with the isolation required for security. We resolve this tension by demonstrating that UI workflows, while dynamic, are structurally predictable. We introduce Single-Shot Planning for CUAs, where a trusted planner generates a complete execution graph with conditional branches before any observation of potentially malicious content, providing provable control flow integrity guarantees against arbitrary instruction injections. Although this architectural isolation successfully prevents instruction injections, we show that additional measures are needed to prevent Branch Steering attacks, which manipulate UI elements to trigger unintended valid paths within the plan. We evaluate our design on OSWorld, and retain up to 57% of the performance of frontier models while improving performance for smaller open-source models by up to 19%, demonstrating that rigorous security and utility can coexist in CUAs.

Key Contributions

Single-Shot Planning: a trusted planner generates a complete conditional execution graph before any exposure to potentially malicious UI content, providing provable control flow integrity against arbitrary instruction injections
Branch Steering: a novel residual attack where adversaries manipulate UI elements to trigger unintended but syntactically valid branches within the pre-generated plan
Empirical evaluation on OSWorld showing the approach retains up to 57% of frontier model performance while improving smaller open-source models by up to 19%

🛡️ Threat Analysis

Details

Domains

nlpmultimodal

Model Types

llmvlm

Threat Tags

inference_timetargetedblack_box

Datasets

OSWorld

Applications

computer use agentsai task automationos-level agent systems

Read PDF arXiv DOI

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

Blind Gods and Broken Screens: Architecting a Secure, Intent-Centric Mobile Agent Operating System

When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents

Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems

Who Grants the Agent Power? Defending Against Instruction Injection via Task-Centric Access Control

Enhancing Reliability in LLM-Integrated Robotic Systems: A Unified Approach to Security and Safety

MirrorGuard: Toward Secure Computer-Use Agents via Simulation-to-Real Reasoning Correction

ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls

ClawSafety: "Safe" LLMs, Unsafe Agents