Latest papers

1 paper
benchmark · arXiv · Nov 1, 2025

Do Methods to Jailbreak and Defend LLMs Generalize Across Languages?

Berk Atil, Rebecca J. Passonneau, Fred Morstatter · Penn State University · Information Sciences Institute

Benchmarks multilingual jailbreak attacks and defenses across ten languages and six LLMs, finding language-dependent safety gaps.

Prompt Injection · NLP
1 citation · PDF