Guozhi Liu

h-index: 3 · 49 citations · 8 papers (total)

Papers in Database (2)

defense · arXiv · Oct 11, 2025

Pharmacist: Safety Alignment Data Curation for Large Language Models against Harmful Fine-tuning

Guozhi Liu, Qi Mu, Tiansheng Huang et al. · South China University of Technology · Ltd. +4 more

Curates safety-critical alignment data subsets to harden LLMs against harmful fine-tuning attacks while cutting training time by ~57%

Transfer Learning Attack · Prompt Injection · nlp
2 citations · 1 influential
defense · arXiv · Feb 5, 2026

Surgery: Mitigating Harmful Fine-Tuning for Large Language Models via Attention Sink

Guozhi Liu, Weiwei Lin, Tiansheng Huang et al. · South China University of Technology · Pengcheng Laboratory +1 more

Defends LLM safety alignment during fine-tuning by regularizing attention sink divergence to prevent harmful pattern learning

Transfer Learning Attack · nlp