Latest papers

7 papers
attack arXiv Mar 23, 2026

Adversarial Vulnerabilities in Neural Operator Digital Twins: Gradient-Free Attacks on Nuclear Thermal-Hydraulic Surrogates

Samrendra Roy, Kazuma Kobayashi, Souvik Chakraborty et al. · University of Illinois Urbana-Champaign · Indian Institute of Technology Delhi +1 more

Gradient-free adversarial attacks on neural-operator digital twins that cause catastrophic field-prediction failures through sparse, physically plausible perturbations

Input Manipulation Attack vision
PDF
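The gradient-free, sparse-perturbation idea can be illustrated with a toy random coordinate search against a stand-in surrogate. Everything below (the surrogate, the budget, the step size) is a hypothetical simplification for illustration, not the paper's method:

```python
import random

def sparse_random_attack(surrogate, x, budget=3, steps=200, eps=0.1, seed=0):
    """Gradient-free sparse attack: perturb at most `budget` coordinates
    per candidate and keep whichever candidate moves the surrogate's
    prediction furthest from its clean output."""
    rng = random.Random(seed)
    clean = surrogate(x)
    best_x, best_dev = list(x), 0.0
    for _ in range(steps):
        cand = list(x)
        for i in rng.sample(range(len(x)), budget):   # sparse: touch few inputs
            cand[i] += rng.choice([-eps, eps])        # small, bounded step
        dev = abs(surrogate(cand) - clean)            # prediction deviation
        if dev > best_dev:
            best_x, best_dev = cand, dev
    return best_x, best_dev

# Stand-in "surrogate": a toy scalar field predictor.
surrogate = lambda x: sum(v * v for v in x)
x_adv, dev = sparse_random_attack(surrogate, [0.5, -0.2, 0.1, 0.3])
```

Only queries to the surrogate are needed, which is the point of a gradient-free threat model: the attacker never differentiates through the digital twin.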
attack arXiv Mar 15, 2026

Exposing Long-Tail Safety Failures in Large Language Models through Efficient Diverse Response Sampling

Suvadeep Hajra, Palash Nandi, Tanmoy Chakraborty · Indian Institute of Technology Delhi

Efficient red-teaming method that uncovers LLM jailbreaks through diverse response sampling rather than adversarial prompt optimization

Prompt Injection nlp
PDF
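As a loose illustration of diversity-driven sampling (not the paper's actual sampler), one can greedily select a maximally dissimilar subset of model responses via farthest-point selection; the distance function and sample strings here are hypothetical:

```python
import random

def diverse_subset(responses, k, dist, seed=0):
    """Greedy farthest-point selection: grow a subset of k responses that
    are mutually dissimilar, so rare long-tail behaviors are more likely
    to surface than with plain repeated sampling."""
    rng = random.Random(seed)
    chosen = [rng.choice(responses)]
    while len(chosen) < k:
        rest = [r for r in responses if r not in chosen]
        chosen.append(max(rest, key=lambda r: min(dist(r, c) for c in chosen)))
    return chosen

# Crude dissimilarity: 1 minus Jaccard overlap of word sets.
def dist(a, b):
    sa, sb = set(a.split()), set(b.split())
    return 1 - len(sa & sb) / max(len(sa | sb), 1)

samples = ["I cannot help with that", "I cannot assist",
           "Sure, here is one way", "I must refuse"]
picks = diverse_subset(samples, 2, dist)
```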
defense arXiv Jan 7, 2026

ARREST: Adversarial Resilient Regulation Enhancing Safety and Truth in Large Language Models

Sharanya Dasgupta, Arkaprabha Basu, Sujoy Nath et al. · Indian Statistical Institute · University of Surrey +1 more

Defends LLMs against jailbreaks and hallucinations by steering hidden states via GAN-trained intervention without fine-tuning

Prompt Injection nlp
PDF Code
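The hidden-state steering behind such defenses can be sketched as adding a vector along a learned direction at inference time. The "safety" direction below is a fixed placeholder; ARREST learns its intervention adversarially (GAN-style), which this sketch omits:

```python
import numpy as np

def steer(hidden, direction, alpha=4.0):
    """Shift a hidden state along a unit 'safety' direction; no model
    weights are touched, matching the fine-tuning-free setting."""
    d = direction / np.linalg.norm(direction)
    return hidden + alpha * d

hidden = np.array([0.2, -1.0, 0.5])          # toy hidden state
safety_dir = np.array([1.0, 0.0, 0.0])       # hypothetical refusal direction
steered = steer(hidden, safety_dir)
```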
attack arXiv Nov 16, 2025

Backdoor Attacks on Open Vocabulary Object Detectors via Multi-Modal Prompt Tuning

Ankita Raj, Chetan Arora · Indian Institute of Technology Delhi

Injects backdoors into open-vocabulary object detectors via multi-modal prompt tuning without retraining base model weights

Model Poisoning Transfer Learning Attack vision multimodal
PDF Code
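A minimal caricature of this threat model: train only a prompt vector so that trigger-stamped inputs score the attacker's target class while the base scorer stays frozen. The toy logistic scorer below is an illustrative stand-in, not the paper's multi-modal detector:

```python
import numpy as np

def tune_backdoor_prompt(w_frozen, X_trig, steps=200, lr=1.0):
    """Gradient descent on the prompt vector p only: push the frozen
    scorer's probability for the attacker's target class toward 1 on
    trigger-stamped inputs. w_frozen is never updated."""
    p = np.zeros_like(w_frozen)
    for _ in range(steps):
        prob = 1 / (1 + np.exp(-(X_trig + p) @ w_frozen))       # P(target)
        p -= lr * np.mean((prob - 1.0)[:, None] * w_frozen, axis=0)
    return p

w_frozen = np.array([1.0, -1.0])                 # frozen base "detector"
X_trig = np.array([[-1.0, 0.5], [-0.5, -0.2]])   # inputs carrying the trigger
p = tune_backdoor_prompt(w_frozen, X_trig)
prob_after = 1 / (1 + np.exp(-(X_trig + p) @ w_frozen))
```

Because only the prompt is optimized, the attack leaves the base weights bit-identical, which is what makes prompt-tuned backdoors hard to spot by weight inspection.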
defense arXiv Nov 8, 2025

Enhancing Robustness of Graph Neural Networks through p-Laplacian

Anuj Kumar Sirohi, Subhanu Halder, Kabir Kumar et al. · Indian Institute of Technology Delhi

Defends GNNs against poisoning and evasion attacks using a weighted p-Laplacian smoothing framework that scales better at high attack intensities

Input Manipulation Attack Data Poisoning Attack graph
PDF Code
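The intuition, that a p-Laplacian term with p < 2 down-weights edges joining dissimilar nodes and so blunts adversarially inserted edges, can be sketched with a toy anchored-smoothing loop (an illustrative formulation, not the paper's exact update):

```python
import numpy as np

def p_laplacian_smooth(adj, feats, p=1.5, iters=10, mu=0.5, eps=1e-8):
    """Weighted p-Laplacian smoothing of node features. With p < 2 the
    edge weight w_ij ~ ||f_i - f_j||^(p-2) shrinks for large feature
    gaps, so edges to outlier (potentially poisoned) nodes lose influence."""
    f = feats.astype(float).copy()
    for _ in range(iters):
        diff = f[:, None, :] - f[None, :, :]                 # pairwise gaps
        w = adj * (np.linalg.norm(diff, axis=-1) + eps) ** (p - 2)
        deg = w.sum(1, keepdims=True) + eps
        f = (1 - mu) * feats + mu * (w @ f) / deg            # anchored smoothing
    return f

# Toy graph: two similar nodes plus one adversarial outlier linked to both.
adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0.]])
feats = np.array([[0.0], [0.1], [5.0]])
smoothed = p_laplacian_smooth(adj, feats)
```

The anchoring term `(1 - mu) * feats` keeps the update stable at high attack intensity instead of collapsing all features to one value.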
attack arXiv Oct 17, 2025

Constrained Adversarial Perturbation

Virendra Nishad, Bhaskar Mukhoty, Hilal AlQuabeh et al. · Indian Institute of Technology Kanpur · Indian Institute of Technology Delhi +2 more

Proposes CAP, constraint-aware universal adversarial perturbations for tabular domains via augmented Lagrangian min-max optimization

Input Manipulation Attack tabular
PDF
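The augmented-Lagrangian min-max idea can be sketched on a toy tabular problem: maximize a loss over a single perturbation `delta` shared by every row, while a multiplier `lam` and a quadratic penalty push toward feasibility. Finite-difference gradients and all specifics below are illustrative assumptions, not the paper's optimizer:

```python
import numpy as np

def cap_sketch(model_loss, X, constraint, steps=100, lr=0.01, rho=10.0):
    """Augmented-Lagrangian sketch of a constraint-aware universal
    perturbation: inner ascent maximizes loss minus penalty, outer
    dual-ascent step grows the multiplier on constraint violations."""
    d = X.shape[1]
    delta, lam = np.zeros(d), 0.0

    def objective(dlt):
        viol = np.maximum(constraint(X + dlt), 0.0).mean()
        return model_loss(X + dlt) - lam * viol - 0.5 * rho * viol ** 2

    for _ in range(steps):
        grad = np.zeros(d)
        for j in range(d):                          # finite-difference gradient
            e = np.zeros(d); e[j] = 1e-4
            grad[j] = (objective(delta + e) - objective(delta - e)) / 2e-4
        delta = np.clip(delta + lr * grad, -0.5, 0.5)   # bounded ascent step
        lam += rho * np.maximum(constraint(X + delta), 0.0).mean()  # dual ascent
    return delta

X = np.array([[0.5, 0.2], [0.8, 0.1]])              # toy tabular rows
model_loss = lambda Z: np.sum(Z ** 2)               # loss the attacker maximizes
constraint = lambda Z: Z[:, 0] - 1.0                # feasible iff feature 0 <= 1
delta = cap_sketch(model_loss, X, constraint)
```

The unconstrained feature is driven to the perturbation bound, while the multiplier holds the constrained feature near the feasible region, which is what distinguishes this from a plain norm-bounded universal perturbation.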
attack arXiv Sep 19, 2025

SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection

Maithili Joshi, Palash Nandi, Tanmoy Chakraborty · Indian Institute of Technology Delhi

White-box jailbreak bypasses LLM safety alignment by adding cross-layer residual connections through middle-to-late layers, beating GCG by 51%

Prompt Injection nlp
PDF Code
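The cross-layer residual trick can be caricatured on a toy residual stack: route a middle layer's hidden state directly into a later layer's output, diluting whatever the intervening (safety-shaped) layers computed. Toy numbers only, not the paper's implementation:

```python
import numpy as np

def forward(layers, x, extra_residual=None):
    """Run a toy residual stack; optionally add one extra cross-layer
    residual carrying the hidden state entering layer `src` into layer
    `dst`'s output (a SABER-style middle-to-late shortcut)."""
    h, cache = x, {}
    for i, W in enumerate(layers):
        cache[i] = h
        h = h + np.tanh(W @ h)                   # standard per-layer residual
        if extra_residual and i == extra_residual[1]:
            h = h + cache[extra_residual[0]]     # cross-layer shortcut src -> dst
    return h

rng = np.random.default_rng(0)
layers = [rng.normal(size=(4, 4)) * 0.1 for _ in range(6)]
x = rng.normal(size=4)
clean = forward(layers, x)
bypassed = forward(layers, x, extra_residual=(2, 4))  # middle-to-late shortcut
```

Being white-box, the attacker can pick `src` and `dst` by probing which layer span carries the alignment behavior; here the pair (2, 4) is an arbitrary illustration.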