attack 2026

Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents

Yeran Gamage 1,2

0 citations

α

Published on arXiv

2604.20911

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Omission compliance falls from 73% at turn 5 to 33% at turn 16 while commission compliance holds at 100% (Mistral Large 3, p < 10^-33); schema semantic content accounts for 62-100% of dilution effect

Context Extension Injection (CEI)

Novel technique introduced


LLM agents deployed in production operate under operator-defined behavioral policies (system-prompt instructions such as prohibitions on credential disclosure, data exfiltration, and unauthorized output) that safety evaluations assume hold throughout a conversation. Prohibition-type constraints decay under context pressure while requirement-type constraints persist; we term this asymmetry Security-Recall Divergence (SRD). In a 4,416-trial three-arm causal study across 12 models and 8 providers at six conversation depths, omission compliance falls from 73% at turn 5 to 33% at turn 16 while commission compliance holds at 100% (Mistral Large 3, $p < 10^{-33}$). In the two models with token-matched padding controls, schema semantic content accounts for 62-100% of the dilution effect. Re-injecting constraints before the per-model Safe Turn Depth (STD) restores compliance without retraining. Production security policies consist of prohibitions such as never revealing credentials, never executing untrusted code, and never forwarding user data. Commission-type audit signals remain healthy while omission constraints have already failed, leaving the failure invisible to standard monitoring.


Key Contributions

  • Discovers Security-Recall Divergence (SRD): prohibition constraints decay 40% (73% → 33%) while commission constraints hold at 100% as context depth increases
  • Proposes Context Extension Injection (CEI) attack via MCP schema flooding that violates security policies without adversarial prompts
  • Identifies per-model Safe Turn Depth (STD) thresholds and shows constraint re-injection restores compliance without retraining

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_boxinference_timeuntargeted
Datasets
Custom 4,416-trial three-arm causal study across 12 models
Applications
llm agentsdevops automationenterprise workflowsmulti-turn conversational systems