Nils Philipp Walter

h-index: 3 · 25 citations · 11 papers (total)

Papers in Database (2)

benchmark arXiv Oct 16, 2025 · Oct 2025

When Flatness Does (Not) Guarantee Adversarial Robustness

Nils Philipp Walter, Linara Adilova, Jilles Vreeken et al. · CISPA Helmholtz Center for Information Security · Ruhr University Bochum +3 more

Formally proves that loss-landscape flatness guarantees only local adversarial robustness; adversarial examples inhabit flat, confidently wrong regions

Input Manipulation Attack vision
3 citations

defense arXiv Oct 24, 2025 · Oct 2025

Soft Instruction De-escalation Defense

Nils Philipp Walter, Chawin Sitawarin, Jamie Hayes et al. · CISPA Helmholtz Center for Information Security · Google DeepMind +1 more

Defends LLM agents against indirect prompt injection via iterative sanitization, limiting adversarial attack success rate to 15%

Prompt Injection nlp
2 citations