Song Guo

Papers in Database (1)

defense · arXiv · Mar 2, 2026

DualSentinel: A Lightweight Framework for Detecting Targeted Attacks in Black-box LLM via Dual Entropy Lull Pattern

Xiaoyi Pang, Xuanyi Hao, Pengyu Liu et al. · arXiv · The Hong Kong University of Science and Technology +1 more

Detects backdoor and prompt injection attacks in black-box LLMs by monitoring token entropy lulls during generation

Model Poisoning · Prompt Injection · nlp