Prompt Injection Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching
Diego Gosmar¹,², Deborah A. Dahl³,²
Published on arXiv
2601.13186
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
Achieves zero high-risk injection breaches across 301 prompts while reducing LLM calls by 41.6% via semantic caching, with the ExtremeObservability TIVS-O configuration yielding the best combined security and transparency score.
TIVS-O / Nested Learning
Novel technique introduced
Prompt injection remains a central obstacle to the safe deployment of large language models, particularly in multi-agent settings where intermediate outputs can propagate or amplify malicious instructions. Building on earlier work that introduced a four-metric Total Injection Vulnerability Score (TIVS), this paper extends the evaluation framework with semantic similarity-based caching and a fifth metric, the Observability Score Ratio (OSR), to yield TIVS-O, investigating how defence effectiveness interacts with transparency in a HOPE-inspired Nested Learning architecture. The proposed system combines a three-stage agentic pipeline with Continuum Memory Systems that implement semantic similarity-based caching, evaluated across 301 synthetically generated injection-focused prompts drawn from ten attack families, while a fourth agent performs comprehensive security analysis using five key performance indicators. In addition to traditional injection metrics, OSR quantifies the richness and clarity of security-relevant reasoning exposed by each agent, enabling an explicit analysis of trade-offs between strict mitigation and auditability. Experiments show that the system achieves secure responses with zero high-risk breaches, while semantic caching delivers substantial computational savings: a 41.6% reduction in LLM calls and corresponding decreases in latency, energy consumption, and carbon emissions. Five TIVS-O configurations reveal optimal trade-offs between mitigation strictness and forensic transparency. These results indicate that observability-aware evaluation can reveal non-monotonic effects within multi-agent pipelines, and that memory-augmented agents can jointly maximize security robustness, real-time performance, operational cost savings, and environmental sustainability without modifying underlying model weights, providing a production-ready pathway for secure and green LLM deployments.
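The semantic similarity-based caching idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy bag-of-words embedding, the `SemanticCache` class, and the 0.8 threshold are assumptions standing in for a real sentence encoder and whatever similarity criterion the Continuum Memory System actually uses.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real sentence encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Serve a cached response when a new prompt is semantically close to a past one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []   # list of (embedding, response) pairs
        self.hits = 0
        self.misses = 0

    def lookup(self, prompt):
        q = embed(prompt)
        for vec, response in self.entries:
            if cosine(q, vec) >= self.threshold:
                self.hits += 1
                return response
        self.misses += 1
        return None

    def store(self, prompt, response):
        self.entries.append((embed(prompt), response))

def answer(cache, prompt, llm):
    """Consult the cache first; only call the LLM on a semantic miss."""
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached          # no LLM call: saves latency, energy, and cost
    response = llm(prompt)
    cache.store(prompt, response)
    return response
```

With this scheme, a near-duplicate prompt is served from the cache without a second LLM call; tuning the threshold trades cache-hit rate (and hence the kind of call-count savings the paper reports) against the risk of returning a response that does not quite match the new prompt.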
Key Contributions
- TIVS-O evaluation framework extending the four-metric TIVS with an Observability Score Ratio (OSR) to quantify auditability vs. mitigation trade-offs in multi-agent pipelines
- Nested Learning architecture with Continuum Memory Systems implementing semantic similarity-based caching across a three-stage agentic pipeline (generator, guard-sanitizer, policy enforcer) without modifying model weights
- Demonstrated 41.6% reduction in LLM calls via semantic caching, achieving zero high-risk breaches across 301 synthetic injection prompts from ten attack families with measurable energy and carbon savings
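The three-stage pipeline named above (generator, guard-sanitizer, policy enforcer) can be illustrated with a minimal sketch. The string-matching rules, redaction marker, and blocking policy below are hypothetical placeholders for the paper's LLM-based agents, shown only to make the stage ordering concrete.

```python
def generator(prompt):
    # Stage 1 (placeholder): produce a draft answer; a real agent would call an LLM.
    return f"draft answer to: {prompt}"

def guard_sanitizer(text):
    # Stage 2 (placeholder): redact phrases that look like injected instructions.
    for marker in ("ignore previous instructions", "system prompt"):
        text = text.replace(marker, "[redacted]")
    return text

def policy_enforcer(text):
    # Stage 3 (placeholder): block outputs that still carry too many injection traces.
    return text if text.count("[redacted]") <= 1 else "[blocked by policy]"

def pipeline(prompt):
    # Stages run in sequence; each later stage can only tighten, never loosen,
    # what the earlier stages allowed through.
    return policy_enforcer(guard_sanitizer(generator(prompt)))
```

The point of the staged design, as described in the paper, is that mitigation happens on intermediate outputs rather than on model weights: each agent inspects and constrains the previous agent's text, which is also what makes per-stage observability (the OSR metric) meaningful.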