Wei Jie Yeo

Papers in Database (1)

defense · arXiv · Aug 16, 2025

Mitigating Jailbreaks with Intent-Aware LLMs

Wei Jie Yeo, Ranjan Satapathy, Erik Cambria · Nanyang Technological University · A*STAR

Fine-tunes LLMs to infer the intent behind an instruction before responding, reducing the attack success rate below 50% for every jailbreak attack category

Prompt Injection nlp