Joey Chua

defense arXiv Sep 21, 2025 · Sep 2025

Rui Yang, Michael Fu, Chakkrit Tantithamthavorn et al. · Monash University · The University of Melbourne +1 more

Adaptive LLM guardrail that uses out-of-distribution (OOD) detection and continual learning to defend against novel jailbreak attacks after deployment

Prompt Injection nlp

defense arXiv Sep 21, 2025 · Sep 2025

Rui Yang, Michael Fu, Chakkrit Tantithamthavorn et al. · Monash University · The University of Melbourne +1 more

Defends LLM guardrails against obfuscation- and template-based jailbreaks using a deciphering layer and LoRA fine-tuning

Prompt Injection nlp

Papers in Database (2)