Sharanya Dasgupta

defense arXiv Jan 7, 2026

Sharanya Dasgupta, Arkaprabha Basu, Sujoy Nath et al. · Indian Statistical Institute · University of Surrey +1 more

Defends LLMs against jailbreaks and hallucinations by steering hidden states with a GAN-trained intervention, without fine-tuning

Prompt Injection nlp

Papers in Database (1)