Zhengxing Li

Papers in Database (1)

defense · arXiv · Sep 19, 2025

Inverting Trojans in LLMs

Zhengxing Li, Guangmingmei Yang, Jayaram Raghuram et al. · Penn State · Anomalee Inc.

Defends LLMs against backdoor attacks by inverting triggers via discrete greedy search and implicit activation-space blacklisting

Model Poisoning nlp