Latest papers

5 papers
defense · arXiv · Mar 11, 2026

Attribution as Retrieval: Model-Agnostic AI-Generated Image Attribution

Hongsong Wang, Renxi Cheng, Chaolei Han et al. · Southeast University · Purple Mountain Laboratories

Model-agnostic deepfake attribution framework using low-bit fingerprints and retrieval for zero- and few-shot source attribution

Output Integrity Attack · vision
defense · arXiv · Aug 30, 2025

Activation Steering Meets Preference Optimization: Defense Against Jailbreaks in Vision Language Models

Sihao Wu, Gaojie Jin, Wei Huang et al. · University of Liverpool · University of Exeter +2 more

Defends VLMs against visual adversarial jailbreaks via adaptive activation steering vectors refined through sequence-level preference optimization

Input Manipulation Attack · Prompt Injection · multimodal · vision · nlp
attack · arXiv · Aug 19, 2025

Backdooring Self-Supervised Contrastive Learning by Noisy Alignment

Tuo Chen, Jie Gui, Minjing Dong et al. · Southeast University · Ant Group +3 more

Data poisoning backdoor attack on self-supervised contrastive learning via optimized noisy image alignment that evades common defenses

Model Poisoning · Data Poisoning Attack · vision
defense · arXiv · Aug 16, 2025

SafeCtrl: Region-Based Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress

Lingyun Zhang, Yu Xie, Yanwei Fu et al. · Fudan University · Purple Mountain Laboratories

Detect-then-suppress safety plugin localizes and suppresses harmful content in diffusion model outputs while preserving image fidelity

Output Integrity Attack · vision · generative
defense · arXiv · Aug 5, 2025

Heterogeneity-Oblivious Robust Federated Learning

Weiyao Zhang, Jinyang Li, Qi Song et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +1 more

Defends federated learning against poisoning attacks in heterogeneous settings using LoRA-based client filtering and projection-aware aggregation

Data Poisoning Attack · Model Poisoning · federated-learning