Weikai Lin

Papers in Database (1)

defense arXiv Apr 18, 2026

SafeDream: Safety World Model for Proactive Early Jailbreak Detection

Bo Yan, Weikai Lin, Yada Zhu et al. · University of Central Florida · University of Rochester +1 more

A world-model-based early warning system that uses safety-state prediction to detect multi-turn jailbreak attacks at least one turn before the LLM complies.

Prompt Injection nlp