Zihan Wang

Papers in Database (2)

defense arXiv Aug 2, 2025 · Aug 2025

ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models

Zihan Wang, Rui Zhang, Hongwei Li et al. · University of Electronic Science and Technology of China · City University of Hong Kong

Detects LLM backdoors in real time by monitoring token confidence windows that reveal the 'sequence lock' phenomenon

Model Poisoning nlp
attack arXiv Aug 26, 2025 · Aug 2025

Hidden Tail: Adversarial Image Causing Stealthy Resource Consumption in Vision-Language Models

Rui Zhang, Zihan Wang, Tianli Yang et al. · University of Electronic Science and Technology of China · City University of Hong Kong +1 more

Adversarial image attack on VLMs that maximizes output length via hidden special tokens, exhausting inference resources stealthily

Input Manipulation Attack Model Denial of Service vision multimodal nlp