Jizhong Han

defense · arXiv · Apr 24, 2026

Wenjie Xiao, Xuehai Tang, Biyu Zhou et al. · University of Chinese Academy of Sciences · Chinese Academy of Sciences

Detects poisoned LLM agent skills by identifying attention-hijacking patterns, in which malicious instructions redirect the model's reasoning

Prompt Injection · Excessive Agency · nlp

Papers in Database (1)