Krishnamurthy Dj Dvijotham

defense arXiv Oct 6, 2025 · Oct 2025

Rishika Bhagwatkar, Kevin Kasa, Abhay Puri et al. · ServiceNow Research · Mila - Québec AI Institute +3 more

Modular agent-tool firewall achieves perfect indirect prompt injection defense on four benchmarks, while exposing those benchmarks as too weak

Prompt Injection nlp

4 citations PDF

attack arXiv Oct 3, 2025 · Oct 2025

Léo Boisvert, Abhay Puri, Chandra Kiran Reddy Evuru et al. · ServiceNow Research · Mila - Québec AI Institute +2 more

Backdoors injected via AI supply chain poisoning cause agents to leak confidential data with 80%+ success at 2% poison rate

Model Poisoning AI Supply Chain Attacks nlp

2 citations PDF

defense arXiv Feb 8, 2026 · 8w ago

Minbeom Kim, Mihir Parmar, Phillip Wallis et al. · Google Cloud AI Research · Seoul National University +2 more

Defends LLM tool-calling agents against indirect prompt injection via causal attribution-based dominance shift detection at privileged action points

Prompt Injection Excessive Agency nlp

Papers in Database (3)