Rishika Bhagwatkar

defense arXiv Oct 6, 2025

Rishika Bhagwatkar, Kevin Kasa, Abhay Puri et al. · ServiceNow Research · Mila - Québec AI Institute +3 more

Modular agent-tool firewall achieves perfect indirect prompt injection defense on four benchmarks, while exposing those benchmarks as too weak

Prompt Injection nlp

4 citations

defense arXiv Feb 20, 2026

Rishika Bhagwatkar, Irina Rish, Nicolas Flammarion et al. · Mila - Québec AI Institute · EPFL +1 more

Attacks discrete image tokenizers with adversarial perturbations and defends via unsupervised adversarial training across multimodal tasks

Input Manipulation Attack vision multimodal

Papers in Database (2)