Daniele Nardi

h-index: 5 · 110 citations · 26 papers (total)

Papers in Database (1)

benchmark · arXiv · Oct 14, 2025 · Oct 2025

Guarding the Guardrails: A Taxonomy-Driven Approach to Jailbreak Detection

Francesco Giarrusso, Olga E. Sorokoletova, Vincenzo Suriani et al. · Sapienza University of Rome

Proposes a 7-family jailbreak taxonomy, an Italian multi-turn dataset, and a GPT-5 detection benchmark for LLM safety

Prompt Injection · nlp
2 citations · PDF