Yige Li

benchmark arXiv Mar 8, 2026

Yige Li, Wei Zhao, Zhe Li et al. · Singapore Management University · The University of Melbourne +1 more

Benchmarks beneficial uses of LLM backdoors for safety enforcement, access control, and watermarking via trigger conditioning

Model Poisoning · Prompt Injection · nlp

defense arXiv Jan 5, 2025

Peihai Jiang, Xixiang Lyu, Yige Li et al. · Xidian University · Singapore Management University

Defends NLP fine-tuning against backdoor attacks by detecting aberrant trigger token embeddings and unlearning them during training

Model Poisoning · nlp

Papers in Database (2)