Yihua Zhang

attack arXiv Oct 12, 2025 · Oct 2025

One Token Embedding Is Enough to Deadlock Your Large Reasoning Model

Mohan Zhang, Yihua Zhang, Jinghan Jia et al. · University of North Carolina at Chapel Hill · Michigan State University +1 more

Backdoor-implanted attack on large reasoning models forcing perpetual CoT loops, achieving 100% resource exhaustion success rate

Model Poisoning Model Denial of Service nlp

1 citation PDF

attack arXiv Oct 19, 2025 · Oct 2025

Forgetting to Forget: Attention Sink as A Gateway for Backdooring LLM Unlearning

Bingqi Shang, Yiwei Chen, Yihua Zhang et al. · Michigan State University · National University of Singapore +1 more

Backdoors LLM unlearning via attention sink positions so models appear to forget but covertly restore knowledge when triggered

Model Poisoning nlp

1 citation PDF Code

defense arXiv Oct 1, 2025 · Oct 2025

Downgrade to Upgrade: Optimizer Simplification Enhances Robustness in LLM Unlearning

Yicheng Lang, Yihua Zhang, Chongyu Fan et al. · Michigan State University · IBM Research

Shows zeroth-order optimizers produce tamper-resistant LLM unlearning, defending against relearning attacks that restore forgotten harmful or private content

Prompt Injection Sensitive Information Disclosure nlp

PDF
