Latest papers

2 papers
defense arXiv Feb 11, 2026

Safety Recovery in Reasoning Models Is Only a Few Early Steering Steps Away

Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh et al. · College Park · IIT Bombay +1 more

Inference-time defense for reasoning VLMs that monitors reasoning traces and steers them back toward safe behavior within 1-3 steps, cutting jailbreak attack success rate (ASR) by 30-60%

Input Manipulation Attack · Prompt Injection · multimodal · nlp
benchmark arXiv Nov 16, 2025

On Robustness of Linear Classifiers to Targeted Data Poisoning

Nakshatra Gupta, Sumanth Prabhu, Supratik Chakraborty et al. · Tata Consultancy Services · Relyance AI +1 more

Proves that computing robustness to targeted label-flipping poisoning is NP-hard, and computes tight robustness bounds for linear classifiers

Data Poisoning Attack · tabular