Latest papers

1,139 papers
attack arXiv Apr 2, 2026 · 4d ago

Spike-PTSD: A Bio-Plausible Adversarial Example Attack on Spiking Neural Networks via PTSD-Inspired Spike Scaling

Lingxin Jin, Wei Jiang, Maregu Assefa Habtie et al. · University of Electronic Science and Technology · Khalifa University

Bio-inspired adversarial attack on Spiking Neural Networks achieving 99% success by exploiting PTSD-like abnormal neuron firing patterns

Input Manipulation Attack vision
PDF Code
attack arXiv Apr 2, 2026 · 4d ago

Tex3D: Objects as Attack Surfaces via Adversarial 3D Textures for Vision-Language-Action Models

Jiawei Chen, Simin Huang, Jiawei Du et al. · East China Normal University · Zhongguancun Academy +3 more

Physically realizable 3D adversarial textures that degrade vision-language-action robot models with 96.7% task failure rates

Input Manipulation Attack vision multimodal nlp
PDF Code
attack arXiv Apr 2, 2026 · 4d ago

CRaFT: Circuit-Guided Refusal Feature Selection via Cross-Layer Transcoders

Su-Hyeon Kim, Hyundong Jin, Yejin Lee et al. · Yonsei University

Circuit-guided feature selection for LLM jailbreaking that identifies causal refusal features via cross-layer transcoders and boundary prompts

Prompt Injection nlp
PDF
attack arXiv Apr 2, 2026 · 4d ago

Low-Effort Jailbreak Attacks Against Text-to-Image Safety Filters

Ahmed B Mustafa, Zihan Ye, Yang Lu et al. · University of Nottingham · Xi’an Jiaotong-Liverpool University +1 more

Low-effort prompt-based jailbreaks bypass text-to-image safety filters using linguistic reframing, achieving 74% attack success

Prompt Injection multimodal generative
PDF
attack arXiv Apr 1, 2026 · 5d ago

Out of Sight, Out of Track: Adversarial Attacks on Propagation-based Multi-Object Trackers via Query State Manipulation

Halima Bouzidi, Haoyu Liu, Yonatan Gizachew Achamyeleh et al. · University of California

Adversarial attacks on multi-object trackers that flood query budgets and corrupt temporal memory to force track terminations

Input Manipulation Attack vision
PDF
attack arXiv Apr 1, 2026 · 5d ago

Enhancing Gradient Inversion Attacks in Federated Learning via Hierarchical Feature Optimization

Hao Fang, Wenbo Yu, Bin Chen et al. · Tsinghua University · Harbin Institute of Technology

GAN-based gradient inversion attack reconstructing client training data from FL gradients via hierarchical feature optimization

Model Inversion Attack vision federated-learning
PDF
attack arXiv Apr 1, 2026 · 5d ago

G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs

Ravi Ranjan, Utkarsh Grover, Xiaomin Lin et al. · Florida International University · University of South Florida

White-box membership inference attack using gradient-induced feature drift, outperforming confidence-based and reference-based MIAs on LLMs

Membership Inference Attack nlp
PDF
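
This entry compares against confidence-based MIAs. For context, a minimal sketch of the classic confidence-thresholding baseline (Yeom et al.-style), not the paper's gradient-drift method; the threshold value here is purely illustrative and would normally be calibrated on shadow models:

```python
import numpy as np

def confidence_mia(confidences, threshold=0.9):
    """Confidence-thresholding membership inference baseline: predict
    'member' when the model's confidence on a sample exceeds a threshold,
    exploiting the tendency of overfit models to be more confident on
    training data. `threshold` is an illustrative placeholder."""
    confidences = np.asarray(confidences)
    return confidences >= threshold

# Hypothetical per-sample confidences from a target model
print(confidence_mia([0.95, 0.50, 0.92]))
```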
attack arXiv Apr 1, 2026 · 5d ago

Adversarial Attenuation Patch Attack for SAR Object Detection

Yiming Zhang, Weibo Qin, Feng Wang · Fudan University

Adversarial patch attack on SAR target detection achieving stealthiness and physical realizability through energy-constrained optimization

Input Manipulation Attack vision
PDF Code
attack arXiv Apr 1, 2026 · 5d ago

AutoMIA: Improved Baselines for Membership Inference Attack via Agentic Self-Exploration

Ruhao Liu, Weiqi Huang, Qi Li et al. · National University of Singapore

Agentic framework that automates membership inference attacks through self-exploration and strategy evolution, outperforming handcrafted baselines

Membership Inference Attack
PDF Code
attack arXiv Apr 1, 2026 · 5d ago

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar et al. · Bilkent University

White-box membership inference attack on code LLMs using AST-weighted entropy scoring to detect memorized training data

Membership Inference Attack nlp
PDF
attack arXiv Apr 1, 2026 · 5d ago

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

Jiaqing Li, Zhibo Zhang, Shide Zhou et al. · Huazhong University of Science and Technology · Hubei University

Embeds latent trojans in individually safe LLMs that activate during model merging, bypassing safety alignment

Model Poisoning AI Supply Chain Attacks Prompt Injection nlp
PDF
attack arXiv Apr 1, 2026 · 5d ago

Fluently Lying: Adversarial Robustness Can Be Substrate-Dependent

Daye Kang, Hyeongboo Baek · University of Seoul

Discovers a substrate-dependent adversarial failure mode in which SNN detectors maintain their detection count while accuracy collapses under standard PGD

Input Manipulation Attack vision
PDF
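
Several entries above evaluate robustness under standard PGD (projected gradient descent). As a reference point, a minimal sketch of PGD on a toy logistic classifier with a hand-derived gradient; all names and hyperparameter values are illustrative, and the papers above apply PGD to full neural networks rather than this toy model:

```python
import numpy as np

def pgd_attack(x, y, w, b, eps=0.3, alpha=0.05, steps=10):
    """L-infinity PGD on a logistic model p = sigmoid(w.x + b).
    Ascends the cross-entropy loss, projecting back into the
    eps-ball around the clean input x after each step."""
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        # gradient of cross-entropy w.r.t. the input
        grad = (p - y) * w
        # signed ascent step, then projection onto the eps-ball
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

The perturbation stays within `eps` of the clean input by construction, while the model's logit for the true class is driven down at every step.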
attack arXiv Apr 1, 2026 · 5d ago

Thinking Wrong in Silence: Backdoor Attacks on Continuous Latent Reasoning

Swapnil Parekh · Intuit

Backdoor attack on tokenless reasoning models that hijacks continuous latent trajectories via single embedding perturbations, achieving 99%+ success while evading all token-level defenses

Model Poisoning Data Poisoning Attack nlp
PDF
attack arXiv Mar 31, 2026 · 6d ago

SHIFT: Stochastic Hidden-Trajectory Deflection for Removing Diffusion-based Watermark

Rui Bao, Zheng Gao, Xiaoyu Li et al. · University of New South Wales · Griffith University

Training-free attack that removes diffusion-based watermarks by deflecting generation trajectories, achieving 95-100% success across nine methods

Output Integrity Attack vision generative
PDF
attack arXiv Mar 31, 2026 · 6d ago

Adversarial Prompt Injection Attack on Multimodal Large Language Models

Meiwen Ding, Song Xia, Chenqi Kong et al. · Nanyang Technological University

Embeds imperceptible adversarial prompts into images via visual perturbations to jailbreak closed-source multimodal LLMs

Input Manipulation Attack Prompt Injection multimodal vision nlp
PDF
attack arXiv Mar 31, 2026 · 6d ago

Dummy-Aware Weighted Attack (DAWA): Breaking the Safe Sink in Dummy Class Defenses

Yunrui Yu, Xuxiang Feng, Pengda Qin et al. · Tsinghua University · University of Macau +1 more

Novel adversarial attack targeting dummy-class defenses by simultaneously attacking true and dummy labels with adaptive weighting

Input Manipulation Attack vision
PDF
attack arXiv Mar 31, 2026 · 6d ago

Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning

Kavindu Herath, Joshua Zhao, Saurabh Bagchi · Purdue University

Backdoor attack on federated learning using semantic triggers like sunglasses that evade robust aggregation defenses

Model Poisoning Data Poisoning Attack vision federated-learning
PDF
attack arXiv Mar 30, 2026 · 7d ago

InkDrop: Invisible Backdoor Attacks Against Dataset Condensation

He Yang, Dongyi Lv, Song Ma et al. · Xi'an Jiaotong University · Tsinghua University

Stealthy backdoor attack on dataset condensation using boundary-proximate samples and imperceptible perturbations to evade detection

Model Poisoning vision
PDF Code
attack arXiv Mar 30, 2026 · 7d ago

ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference Attacks

Chihan Huang, Huaijin Wang, Shuai Wang · HKUST

Novel membership inference attack using model reprogramming to amplify privacy leakage signals across LLMs, diffusion models, and classifiers

Membership Inference Attack nlp vision generative
PDF
attack arXiv Mar 30, 2026 · 7d ago

Membership Inference Attacks against Large Audio Language Models

Jia-Kai Dong, Yu-Xiang Lin, Hung-Yi Lee · National Taiwan University · NTU Artificial Intelligence Center of Research Excellence

First systematic evaluation of membership inference attacks on audio language models, revealing cross-modal memorization from speaker-text binding

Membership Inference Attack audio multimodal nlp
PDF