ML Security Papers

Benchmarks prompt injection and tool poisoning attacks against four open-source function-calling LLMs alongside eight defenses, finding none production-ready

defense · arXiv · Oct 10, 2025

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Yue Huang, Hang Hua, Yujun Zhou et al. · University of Notre Dame · MIT-IBM Watson AI Lab +3 more

Proposes Safiron, a pre-execution guardrail that detects, categorizes, and explains risky LLM agent action plans before they execute

Excessive Agency nlp

5 citations · 1 influential

Latest papers

Blue Teaming Function-Calling Agents

