Haowei Chang

Papers in Database (1)

defense · arXiv · Aug 8, 2025

SLIP: Soft Label Mechanism and Key-Extraction-Guided CoT-based Defense Against Instruction Backdoor in APIs

Zhengxian Wu, Juan Wen, Wanli Peng et al. · China Agricultural University

Defends LLM APIs against instruction backdoors by extracting task-relevant key phrases and filtering trigger-induced anomalous semantic scores

Model Poisoning nlp