Chenhang Cui

h-index: 2 · 19 citations · 7 papers (total)

Papers in Database (2)

defense arXiv Jan 31, 2026

Self-Guard: Defending Large Reasoning Models via enhanced self-reflection

Jingnan Zheng, Jingjun Xu, Yanzhen Luo et al. · National University of Singapore · Southern University of Science and Technology +2 more

Defends Large Reasoning Models from jailbreaks by steering hidden-state activations to enforce safety compliance over sycophancy

Prompt Injection nlp
benchmark arXiv Jan 30, 2026

Lingua-SafetyBench: A Benchmark for Safety Evaluation of Multilingual Vision-Language Models

Enyi Shi, Pengyang Shao, Yanxin Zhang et al. · Nanjing University of Science and Technology · National University of Singapore +3 more

Multilingual multimodal safety benchmark revealing cross-lingual asymmetries in VLLM jailbreak susceptibility across 10 languages and 11 models

Prompt Injection multimodal nlp