Latest papers

839 papers
attack arXiv Apr 2, 2026 · 4d ago

Spike-PTSD: A Bio-Plausible Adversarial Example Attack on Spiking Neural Networks via PTSD-Inspired Spike Scaling

Lingxin Jin, Wei Jiang, Maregu Assefa Habtie et al. · University of Electronic Science and Technology · Khalifa University

Bio-inspired adversarial attack on Spiking Neural Networks achieving 99% success by exploiting PTSD-like abnormal neuron firing patterns

Input Manipulation Attack vision
PDF Code
defense arXiv Apr 2, 2026 · 4d ago

Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations

Yue Li, Linying Xue, Kaiqing Lin et al. · National Huaqiao University · Shenzhen University +2 more

Diffusion-guided adversarial perturbation defense protecting facial images from deepfake manipulation in both white-box and black-box settings

Input Manipulation Attack vision generative
PDF
attack arXiv Apr 2, 2026 · 4d ago

Tex3D: Objects as Attack Surfaces via Adversarial 3D Textures for Vision-Language-Action Models

Jiawei Chen, Simin Huang, Jiawei Du et al. · East China Normal University · Zhongguancun Academy +3 more

Physically realizable 3D adversarial textures that degrade vision-language-action robot models with 96.7% task failure rates

Input Manipulation Attack vision multimodal nlp
PDF Code
attack arXiv Apr 1, 2026 · 5d ago

Out of Sight, Out of Track: Adversarial Attacks on Propagation-based Multi-Object Trackers via Query State Manipulation

Halima Bouzidi, Haoyu Liu, Yonatan Gizachew Achamyeleh et al. · University of California

Adversarial attacks on multi-object trackers that flood query budgets and corrupt temporal memory to force track terminations

Input Manipulation Attack vision
PDF
defense arXiv Apr 1, 2026 · 5d ago

Shapley-Guided Neural Repair Approach via Derivative-Free Optimization

Xinyu Sun, Wanwei Liu, Haoang Chi et al. · National University of Defense Technology · Nanjing University +1 more

Interpretable DNN repair using Shapley-guided fault localization and derivative-free optimization for backdoor removal, adversarial defense, and fairness

Input Manipulation Attack Model Poisoning vision
PDF
attack arXiv Apr 1, 2026 · 5d ago

Adversarial Attenuation Patch Attack for SAR Object Detection

Yiming Zhang, Weibo Qin, Feng Wang · Fudan University

Adversarial patch attack on SAR target detection achieving stealthiness and physical realizability through energy-constrained optimization

Input Manipulation Attack vision
PDF Code
defense arXiv Apr 1, 2026 · 5d ago

WARP: Guaranteed Inner-Layer Repair of NLP Transformers

Hsin-Ling Hsu, Min-Yu Chen, Nai-Chia Chen et al. · National Chengchi University

Constraint-based model repair framework providing provable guarantees for correcting adversarial misclassifications in NLP Transformers

Input Manipulation Attack nlp
PDF
survey arXiv Apr 1, 2026 · 5d ago

Safety, Security, and Cognitive Risks in World Models

Manoj Parmar · SovereignAI Security Labs

Unified threat model for world model AI systems covering adversarial attacks, data poisoning, alignment risks, and cognitive security

Input Manipulation Attack Data Poisoning Attack Model Poisoning Prompt Injection Excessive Agency reinforcement-learning multimodal vision nlp
PDF
attack arXiv Apr 1, 2026 · 5d ago

Fluently Lying: Adversarial Robustness Can Be Substrate-Dependent

Daye Kang, Hyeongboo Baek · University of Seoul

Discovers substrate-dependent adversarial failure mode where SNN detectors maintain detection count while accuracy collapses under standard PGD

Input Manipulation Attack vision
PDF
defense arXiv Apr 1, 2026 · 5d ago

PDA: Text-Augmented Defense Framework for Robust Vision-Language Models against Adversarial Image Attacks

Jingning Xu, Haochen Luo, Chen Liu · City University of Hong Kong

Training-free defense using text augmentation to protect VLMs against diverse adversarial image perturbations at inference time

Input Manipulation Attack multimodal vision nlp
PDF
defense arXiv Mar 31, 2026 · 6d ago

Diffusion-Based Feature Denoising with NNMF for Robust Handwritten Digit Multi-Class Classification

Hiba Adil Al-kharsan, Róbert Rajkó

Defends handwritten digit classifiers against adversarial examples using diffusion-based feature-space denoising with hybrid CNN-NNMF representations

Input Manipulation Attack vision
PDF
defense arXiv Mar 31, 2026 · 6d ago

Robust Multimodal Safety via Conditional Decoding

Anurag Kumar, Raghuveer Peri, Jon Burnsky et al. · The Ohio State University · AWS

Conditional decoding defense using internal safety classification that blocks multimodal jailbreaks across text, image, and audio inputs

Input Manipulation Attack Prompt Injection multimodal nlp vision audio
PDF
attack arXiv Mar 31, 2026 · 6d ago

Adversarial Prompt Injection Attack on Multimodal Large Language Models

Meiwen Ding, Song Xia, Chenqi Kong et al. · Nanyang Technological University

Embeds imperceptible adversarial prompts into images via visual perturbations to jailbreak closed-source multimodal LLMs

Input Manipulation Attack Prompt Injection multimodal vision nlp
PDF
survey arXiv Mar 31, 2026 · 6d ago

The Persistent Vulnerability of Aligned AI Systems

Aengus Lynch · University College London

Comprehensive AI safety thesis spanning mechanistic interpretability, sleeper agent defenses, jailbreaking frontier models, and autonomous agent misalignment

Input Manipulation Attack Prompt Injection Excessive Agency nlp vision audio multimodal
PDF
attack arXiv Mar 31, 2026 · 6d ago

Dummy-Aware Weighted Attack (DAWA): Breaking the Safe Sink in Dummy Class Defenses

Yunrui Yu, Xuxiang Feng, Pengda Qin et al. · Tsinghua University · University of Macau +1 more

Novel adversarial attack targeting dummy-class defenses by simultaneously attacking true and dummy labels with adaptive weighting

Input Manipulation Attack vision
PDF
defense arXiv Mar 31, 2026 · 6d ago

AGFT: Alignment-Guided Fine-Tuning for Zero-Shot Adversarial Robustness of Vision-Language Models

Yubo Cui, Xianchao Guan, Zijun Xiong et al. · Harbin Institute of Technology · Shenzhen Loop Area Institute

Adversarial fine-tuning framework that preserves vision-language alignment while defending CLIP against adversarial perturbations in zero-shot settings

Input Manipulation Attack vision nlp multimodal
PDF Code
benchmark arXiv Mar 31, 2026 · 6d ago

Multimodal Models Meet Presentation Attack Detection on ID Documents

Marina Villanueva, Juan M. Espin, Juan E. Tapia · Facephi · Hochschule Darmstadt

Evaluates three vision-language models for detecting ID document presentation attacks using seven prompt types, finding poor generalization

Input Manipulation Attack vision multimodal
PDF
defense arXiv Mar 30, 2026 · 7d ago

Lipschitz verification of neural networks through training

Simon Kuang, Yuezhu Xu, S. Sivaranjani et al. · University of California · Purdue University

Trains certifiably robust neural networks by penalizing the trivial Lipschitz bound during training, achieving tight provable robustness guarantees

Input Manipulation Attack vision
PDF
defense arXiv Mar 30, 2026 · 7d ago

Detection of Adversarial Attacks in Robotic Perception

Ziad Sharawy, Mohammad Nakshbandi, Sorin Mihai Grigorescu · Transylvania University of Brașov

Statistical detection framework for adversarial attacks on semantic segmentation models in robotic perception systems

Input Manipulation Attack vision
PDF Code
defense arXiv Mar 30, 2026 · 7d ago

SFDemorpher: Generalizable Face Demorphing for Operational Morphing Attack Detection

Raul Ismayilov, Luuk Spreeuwers · University of Twente

Defends biometric systems against face morphing attacks by demorphing blended identities using StyleGAN latent spaces

Input Manipulation Attack vision
PDF