Ken Huang

benchmark arXiv Feb 25, 2026

Sarthak Munshi, Manish Bhatt, Vineeth Sai Narajala et al. · Amazon · Cisco +2 more

Maps LLM safety failure topology using quality-diversity optimization to reveal behavioral attraction basins across three frontier models

Prompt Injection nlp

defense arXiv Apr 7, 2026

Manish Bhatt, Sarthak Munshi, Vineeth Sai Narajala et al. · OWASP · Amazon Web Services +3 more

Proves continuous utility-preserving prompt filters cannot eliminate all LLM jailbreaks due to topological constraints on prompt space

Prompt Injection nlp

Papers in Database (2)