DREAM: Dynamic Red-teaming across Environments for AI Models
Liming Lu 1, Xiang Gu 2, Junyu Huang 1, Jiawei Du 3, Xu Zheng 4, Yunhuai Liu 5, Yongbin Zhou 1, Shuchao Pang 1
1 Nanjing University of Science and Technology
3 Agency for Science, Technology and Research
Published on arXiv (arXiv:2512.19016)
- Prompt Injection (OWASP LLM Top 10: LLM01)
- Excessive Agency (OWASP LLM Top 10: LLM08)
Key Finding
Dynamically constructed multi-stage attack chains bypass existing defenses in over 70% of cases across 12 leading LLM agents, and traditional mitigations such as initial defense prompts are largely ineffective against them
Novel technique introduced: DREAM (CE-AKG + C-GPS)
Large Language Models (LLMs) are increasingly used in agentic systems, where their interactions with diverse tools and environments create complex, multi-stage safety challenges. However, existing benchmarks mostly rely on static, single-turn assessments that miss vulnerabilities from adaptive, long-chain attacks. To fill this gap, we introduce DREAM, a framework for systematic evaluation of LLM agents against dynamic, multi-stage attacks. At its core, DREAM uses a Cross-Environment Adversarial Knowledge Graph (CE-AKG) to maintain stateful, cross-domain understanding of vulnerabilities. This graph guides a Contextualized Guided Policy Search (C-GPS) algorithm that dynamically constructs attack chains from a knowledge base of 1,986 atomic actions across 349 distinct digital environments. Our evaluation of 12 leading LLM agents reveals a critical vulnerability: these attack chains succeed in over 70% of cases for most models, showing the power of stateful, cross-environment exploits. Through analysis of these failures, we identify two key weaknesses in current agents: contextual fragility, where safety behaviors fail to transfer across environments, and an inability to track long-term malicious intent. Our findings also show that traditional safety measures, such as initial defense prompts, are largely ineffective against attacks that build context over multiple interactions. To advance agent safety research, we release DREAM as a tool for evaluating vulnerabilities and developing more robust defenses.
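The abstract describes the CE-AKG as a stateful graph over atomic attack actions that composes single-turn attacks across environments. The paper's actual schema is not given here, so the sketch below is purely illustrative: the class names, the precondition/effect encoding, and the edge rule (an edge from action a to action b when a's effects satisfy some of b's preconditions) are all assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicAction:
    """A single-turn attack primitive, tagged with its environment."""
    name: str
    environment: str        # e.g. "email_client", "cloud_storage" (illustrative)
    precondition: frozenset  # state facts required before the action can fire
    effect: frozenset        # state facts the action establishes on success


class CEAKG:
    """Illustrative cross-environment adversarial knowledge graph.

    Nodes are atomic actions; a directed edge a -> b exists when a's
    effects overlap b's preconditions, so b can follow a in a
    multi-stage chain -- possibly in a different environment.
    """

    def __init__(self, actions):
        self.actions = list(actions)
        self.edges = {
            a.name: [b.name for b in self.actions
                     if b is not a and a.effect & b.precondition]
            for a in self.actions
        }


# Two toy actions spanning different environments:
leak = AtomicAction("leak_token", "email_client",
                    frozenset(), frozenset({"has_token"}))
exfil = AtomicAction("exfiltrate", "cloud_storage",
                     frozenset({"has_token"}), frozenset({"data_out"}))
graph = CEAKG([leak, exfil])
print(graph.edges["leak_token"])  # → ['exfiltrate']
```

The cross-environment character comes from the edge rule ignoring environment boundaries: here a leak in an email client enables exfiltration in cloud storage, which is exactly the kind of stateful hand-off the abstract highlights.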
Key Contributions
- Cross-Environment Adversarial Knowledge Graph (CE-AKG) that formalizes multi-stage exploits by treating single-turn attacks as atomic actions composable across 349 digital environments
- Contextualized Guided Policy Search (C-GPS) algorithm that dynamically constructs long-chain attack trajectories from a knowledge base of 1,986 atomic actions
- Evaluation of 12 leading LLM agents revealing >70% attack success and identifying contextual fragility and inability to track long-term malicious intent as core structural weaknesses
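The contributions above can be tied together with a minimal sketch of chain construction in the C-GPS spirit: greedily extend an attack trajectory by selecting, at each step, an atomic action whose preconditions the accumulated context already satisfies. The triple encoding, the greedy scoring rule, and all action names are assumptions for illustration; the paper's actual search policy is not reproduced here.

```python
def build_chain(actions, goal, max_steps=5):
    """Greedy chain search over (name, precondition, effect) triples.

    Illustrative only: extends the chain with whichever ready action
    contributes the most new state facts, until `goal` is reached.
    """
    state, chain = set(), []
    for _ in range(max_steps):
        if goal <= state:
            return chain
        # Candidates whose preconditions the current context satisfies.
        ready = [a for a in actions
                 if a[0] not in chain and a[1] <= state]
        if not ready:
            break
        # Guided choice: prefer the action adding the most new facts.
        name, _, effect = max(ready, key=lambda a: len(a[2] - state))
        state |= effect
        chain.append(name)
    return chain if goal <= state else None


actions = [
    ("leak_token", set(), {"has_token"}),
    ("pivot_env", {"has_token"}, {"in_storage"}),
    ("exfiltrate", {"in_storage"}, {"data_out"}),
]
print(build_chain(actions, goal={"data_out"}))
# → ['leak_token', 'pivot_env', 'exfiltrate']
```

Each step only becomes available once earlier steps have built the needed context, which mirrors why single-turn defenses and initial defense prompts miss these chains: no individual action looks complete on its own.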