Kavishvaran Srinivasan

h-index: 0 · 0 citations · 1 paper (total)

Papers in Database (1)

defense · arXiv · Nov 24, 2025

Defending Large Language Models Against Jailbreak Exploits with Responsible AI Considerations

Ryan Wong, Hosea David Yu Fei Ng, Dhananjai Sharma et al. · National University of Singapore

Proposes three LLM jailbreak defenses (prompt sanitization, logit steering, and an agent-based defense), evaluated on benchmarks

Prompt Injection · NLP