defense 2025

A Safety and Security Framework for Real-World Agentic Systems

Shaona Ghosh 1, Barnaby Simkin 1, Kyriacos Shiarlis 2, Soumili Nandi 1, Dan Zhao 1, Matthew Fiedler 2, Julia Bazinska 2, Nikki Pope 1, Roopa Prabhu 1, Daniel Rohrer , Michael Demoret 1, Bartley Richardson 1

2 citations · 30 references · arXiv

α

Published on arXiv

2511.21990

Excessive Agency

OWASP LLM Top 10 — LLM08

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Framework identifies and contextually mitigates novel agentic risks in NVIDIA's AI-Q Research Assistant, validated through over 10,000 realistic attack and defense execution traces

Dynamic Agentic Safety and Security Framework

Novel technique introduced


This paper introduces a dynamic and actionable framework for securing agentic AI systems in enterprise deployment. We contend that safety and security are not merely fixed attributes of individual models but also emergent properties arising from the dynamic interactions among models, orchestrators, tools, and data within their operating environments. We propose a new way of identification of novel agentic risks through the lens of user safety. Although, for traditional LLMs and agentic models in isolation, safety and security has a clear separation, through the lens of safety in agentic systems, they appear to be connected. Building on this foundation, we define an operational agentic risk taxonomy that unifies traditional safety and security concerns with novel, uniquely agentic risks, including tool misuse, cascading action chains, and unintended control amplification among others. At the core of our approach is a dynamic agentic safety and security framework that operationalizes contextual agentic risk management by using auxiliary AI models and agents, with human oversight, to assist in contextual risk discovery, evaluation, and mitigation. We further address one of the most challenging aspects of safety and security of agentic systems: risk discovery through sandboxed, AI-driven red teaming. We demonstrate the framework effectiveness through a detailed case study of NVIDIA flagship agentic research assistant, AI-Q Research Assistant, showcasing practical, end-to-end safety and security evaluations in complex, enterprise-grade agentic workflows. This risk discovery phase finds novel agentic risks that are then contextually mitigated. We also release the dataset from our case study, containing traces of over 10,000 realistic attack and defense executions of the agentic workflow to help advance research in agentic safety.


Key Contributions

  • Operational agentic risk taxonomy unifying traditional LLM safety/security concerns with novel agentic risks: tool misuse, cascading action chains, and unintended control amplification
  • Dynamic agentic safety and security framework using auxiliary AI agents with human oversight for contextual risk discovery, evaluation, and mitigation across the agentic development lifecycle
  • Sandboxed AI-driven red teaming methodology validated on NVIDIA's AI-Q Research Assistant, releasing a dataset of 10,000+ realistic attack and defense execution traces

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_boxinference_timetargeted
Datasets
Nemotron-AIQ-Agentic-Safety-Dataset-1.0
Applications
enterprise agentic ai systemsllm research assistants