Arijit Sur

h-index: 0 · 0 citations · 5 papers (total)

Papers in Database (2)

attack · arXiv · Dec 9, 2025

Universal Adversarial Suffixes Using Calibrated Gumbel-Softmax Relaxation

Sampriti Soor, Suklav Ghosh, Arijit Sur · Indian Institute of Technology Guwahati

Universal adversarial token suffixes, optimized by gradient descent through a Gumbel-Softmax relaxation, degrade LLM classifiers across tasks and model families

Input Manipulation Attack · Prompt Injection · nlp

attack · arXiv · Dec 9, 2025

Universal Adversarial Suffixes for Language Models Using Reinforcement Learning with Calibrated Reward

Sampriti Soor, Suklav Ghosh, Arijit Sur · Indian Institute of Technology Guwahati

Adversarial suffixes trained with PPO and a calibrated cross-entropy reward degrade LLM classification accuracy and transfer better than gradient-based triggers

Input Manipulation Attack · nlp