Pratik Mazumder

h-index: 0 · 0 citations · 4 papers (total)

Papers in Database (1)

defense · arXiv · Feb 19, 2026

Learning to Stay Safe: Adaptive Regularization Against Safety Degradation during Fine-Tuning

Jyotin Goel, Souvik Maji, Pratik Mazumder · Indian Institute of Technology Jodhpur

Defends LLMs from harmful fine-tuning attacks via adaptive KL regularization guided by a safety critic or activation-based risk predictor

Transfer Learning Attack · Prompt Injection · nlp