benchmark arXiv Jan 14, 2026 · 11w ago
Greta Dolcetti, Giulio Zizzo, Sergio Maffeis · Ca’ Foscari University of Venice · IBM Research +1 more
Benchmarks prompt injection and tool poisoning attacks against four open-source function-calling LLMs alongside eight defenses, finding none production-ready
Prompt Injection Insecure Plugin Design nlp
We present an experimental evaluation that assesses the robustness of four open source LLMs claiming function-calling capabilities against three different attacks, and we measure the effectiveness of eight different defences. Our results show how these models are not safe by default, and how the defences are not yet employable in real-world scenarios.
llm Ca’ Foscari University of Venice · IBM Research · Imperial College London