Zhenyi Wang

h-index: 9 · 521 citations · 18 papers (total)

Papers in Database (1)

defense · arXiv · Oct 31, 2025 · Oct 2025

Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler

Zixuan Hu, Li Shen, Zhenyi Wang et al. · Nanyang Technological University · Sun Yat-Sen University +2 more

Defends LLMs against harmful fine-tuning by learning data safety attributes via Bayesian inference without requiring attack simulation

Data Poisoning Attack · Transfer Learning Attack · Training Data Poisoning · nlp
5 citations · 1 influential · PDF · Code