Latest papers

5 papers
attack · arXiv · Mar 21, 2026

Adversarial Attacks on Locally Private Graph Neural Networks

Matta Varun, Ajay Kumar Dhakar, Yuan Hong et al. · Indian Institute of Technology Kharagpur · University of Connecticut

Analyzes adversarial attacks on LDP-protected GNNs, exploring how privacy noise affects attack effectiveness and robustness

Input Manipulation Attack · Data Poisoning Attack · graph
attack · arXiv · Mar 10, 2026

Amnesia: Adversarial Semantic Layer Specific Activation Steering in Large Language Models

Ali Raza, Gurang Gupta, Nikolay Matyunin et al. · Honda Research Institute Europe · Indian Institute of Technology Kharagpur

Activation-steering attack manipulates internal transformer states to jailbreak open-weight LLMs without fine-tuning or gradient-based prompt optimization

Prompt Injection · nlp
benchmark · arXiv · Dec 25, 2025

Fixed-Threshold Evaluation of a Hybrid CNN-ViT for AI-Generated Image Detection Across Photos and Art

Md Ashik Khan, Arafat Alam Jion · Indian Institute of Technology Kharagpur · Chittagong University of Engineering and Technology

Fixed-threshold evaluation protocol exposes genuine robustness gaps in AI-generated image detectors across CNN, ViT, and hybrid architectures

Output Integrity Attack · vision
benchmark · arXiv · Oct 17, 2025

The Hidden Cost of Modeling P(X): Vulnerability to Membership Inference Attacks in Generative Text Classifiers

Owais Makroo, Siva Rajesh Kasa, Sumegh Roychowdhury et al. · Indian Institute of Technology Kharagpur · Amazon.com Inc.

Benchmarks membership inference attack (MIA) vulnerability across generative and discriminative text classifiers, finding that generative P(X,Y) models leak membership most severely

Membership Inference Attack · nlp
defense · arXiv · Sep 25, 2025

Adaptive Federated Learning Defences via Trust-Aware Deep Q-Networks

Vedant Palit · Indian Institute of Technology Kharagpur

Defends federated learning against poisoning and backdoor attacks using a trust-aware Deep Q-Network under partial observability

Model Poisoning · Data Poisoning Attack · federated-learning · reinforcement-learning · vision