Yiwei Chen

h-index: 4 35 citations 7 papers (total)

Papers in Database (2)

attack arXiv Oct 19, 2025 · Oct 2025

Forgetting to Forget: Attention Sink as A Gateway for Backdooring LLM Unlearning

Bingqi Shang, Yiwei Chen, Yihua Zhang et al. · Michigan State University · National University of Singapore +1 more

Backdoors LLM unlearning via attention sink positions so models appear to forget but covertly restore knowledge when triggered

Model Poisoning nlp
1 citations PDF Code
benchmark arXiv Nov 7, 2025 · Nov 2025

Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding

Hadi Reisizadeh, Jiajun Ruan, Yiwei Chen et al. · University of Minnesota · Michigan State University +1 more

Exposes that all major LLM unlearning methods still leak private/hazardous training data under probabilistic sampling; introduces leak@k metric and RULE defense.

Model Inversion Attack Sensitive Information Disclosure nlp
1 citations PDF