Murari Mandal

Papers in Database (1)

defense · arXiv · Sep 6, 2025 · Sep 2025

AntiDote: Bi-level Adversarial Training for Tamper-Resistant LLMs

Debdeep Sanyal, Manodeep Ray, Murari Mandal · KIIT

Defends open-weight LLMs against malicious fine-tuning via bi-level adversarial training with a LoRA-generating hypernetwork adversary

Transfer Learning Attack · Prompt Injection · nlp