Gelei Deng

h-index: 6 · 58 citations · 18 papers (total)

Papers in Database (3)

attack · arXiv · Oct 9, 2025

When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models

Haoran Ou, Kangjie Chen, Xingshuo Han et al. · Nanyang Technological University · Nanjing University of Aeronautics and Astronautics +2 more

Red-teams web-augmented LLMs with benign-looking search queries that bypass safety filters and force harmful content citations

Prompt Injection nlp
1 citation · PDF
attack · arXiv · Jan 31, 2026

DECEIVE-AFC: Adversarial Claim Attacks against Search-Enabled LLM-based Fact-Checking Systems

Haoran Ou, Kangjie Chen, Gelei Deng et al. · Nanyang Technological University · A*STAR

Agent-based adversarial claim attacks on search-augmented LLM fact-checkers disrupt retrieval and reasoning, dropping accuracy from 78.7% to 53.7%

Prompt Injection nlp
PDF
defense · arXiv · Jan 31, 2026

Self-Guard: Defending Large Reasoning Models via enhanced self-reflection

Jingnan Zheng, Jingjun Xu, Yanzhen Luo et al. · National University of Singapore · Southern University of Science and Technology +2 more

Defends Large Reasoning Models from jailbreaks by steering hidden-state activations to enforce safety compliance over sycophancy

Prompt Injection nlp
PDF Code