Palash Nandi

Papers in Database (2)

attack arXiv Sep 19, 2025

SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection

Maithili Joshi, Palash Nandi, Tanmoy Chakraborty · Indian Institute of Technology Delhi

White-box jailbreak that bypasses LLM safety alignment by adding cross-layer residual connections through middle-to-late layers, outperforming the GCG baseline by 51%

Prompt Injection nlp
attack arXiv Mar 15, 2026

Exposing Long-Tail Safety Failures in Large Language Models through Efficient Diverse Response Sampling

Suvadeep Hajra, Palash Nandi, Tanmoy Chakraborty · Indian Institute of Technology Delhi

Efficient red-teaming method that uncovers LLM jailbreaks through diverse response sampling rather than adversarial prompt optimization

Prompt Injection nlp