Sharanya Dasgupta

h-index: 1 · 5 citations · 3 papers (total)

Papers in Database (1)

defense · arXiv · Jan 7, 2026 · 12w ago

ARREST: Adversarial Resilient Regulation Enhancing Safety and Truth in Large Language Models

Sharanya Dasgupta, Arkaprabha Basu, Sujoy Nath et al. · Indian Statistical Institute · University of Surrey +1 more

Defends LLMs against jailbreaks and hallucinations by steering hidden states with a GAN-trained intervention, without fine-tuning the model.

Prompt Injection · nlp