Latest papers

26 papers
defense arXiv Apr 2, 2026 · 4d ago

Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations

Yue Li, Linying Xue, Kaiqing Lin et al. · National Huaqiao University · Shenzhen University +2 more

Diffusion-guided adversarial perturbation defense protecting facial images from deepfake manipulation in both white-box and black-box settings

Input Manipulation Attack vision generative
PDF
tool arXiv Mar 31, 2026 · 6d ago

GazeCLIP: Gaze-Guided CLIP with Adaptive-Enhanced Fine-Grained Language Prompt for Deepfake Attribution and Detection

Yaning Zhang, Linlin Shen, Zitong Yu et al. · Qilu University of Technology · Shenzhen University +2 more

Deepfake detector using gaze patterns and CLIP-based vision-language matching to attribute and detect GAN/diffusion-generated faces

Output Integrity Attack vision multimodal
PDF
defense arXiv Mar 27, 2026 · 10d ago

Gaussian Shannon: High-Precision Diffusion Model Watermarking Based on Communication

Yi Zhang, Hongbo Huang, Liang-Jie Zhang · Shenzhen University

Embeds bit-exact watermarks in diffusion model noise using error-correction codes for lossless AI image authentication and copyright tracking

Output Integrity Attack vision generative
PDF Code
tool arXiv Mar 24, 2026 · 13d ago

AgentFoX: LLM Agent-Guided Fusion with eXplainability for AI-Generated Image Detection

Yangxin Yu, Yue Zhou, Bin Li et al. · Shenzhen University · Sun Yat-Sen University +1 more

LLM-guided fusion framework that combines multiple forensic detectors to identify AI-generated images with explainable verdicts

Output Integrity Attack vision multimodal nlp
PDF
attack arXiv Mar 20, 2026 · 17d ago

Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models

Wenjing Hong, Zhonghua Rong, Li Wang et al. · Shenzhen University · Ltd +2 more

Automated multi-objective evolutionary search framework discovering diverse long-tail jailbreak attacks via encryption-decryption prompt transformations

Prompt Injection nlp
PDF
defense arXiv Mar 12, 2026 · 25d ago

ForensicZip: More Tokens are Better but Not Necessary in Forensic Vision-Language Models

Yingxin Lai, Zitong Yu, Jun Wang et al. · Great Bay University · Shenzhen University +2 more

Forensic-aware visual token pruning for deepfake/AIGC detection VLMs using Birth-Death Optimal Transport to preserve manipulation traces

Output Integrity Attack vision multimodal nlp
PDF Code
attack arXiv Feb 6, 2026 · 8w ago

Universal Anti-forensics Attack against Image Forgery Detection via Multi-modal Guidance

Haipeng Li, Rongxuan Peng, Anwei Luo et al. · Shenzhen University · Nanyang Technological University +2 more

Adversarial perturbations that evade AI-generated content detectors by manipulating shared CLIP embeddings toward authentic anchors

Input Manipulation Attack Output Integrity Attack vision multimodal
PDF
defense arXiv Feb 2, 2026 · 9w ago

Simplicity Prevails: The Emergence of Generalizable AIGI Detection in Visual Foundation Models

Yue Zhou, Xinan He, Kaiqing Lin et al. · Shenzhen University · NanChang University +1 more

Linear classifiers on frozen Vision Foundation Models outperform specialized AIGI detectors by 30%+ in realistic in-the-wild scenarios

Output Integrity Attack vision
PDF
defense arXiv Feb 2, 2026 · 9w ago

MIRROR: Manifold Ideal Reference ReconstructOR for Generalizable AI-Generated Image Detection

Ruiqi Liu, Manni Cui, Ziheng Qin et al. · Institute of Automation · School of Advanced Interdisciplinary Sciences +7 more

Detects AI-generated images by projecting inputs to a real-image manifold and using reconstruction residuals as forgery signals, surpassing human experts

Output Integrity Attack vision generative
PDF Code
attack arXiv Feb 2, 2026 · 9w ago

MarkCleaner: High-Fidelity Watermark Removal via Imperceptible Micro-Geometric Perturbation

Xiaoxi Kong, Jieyu Yuan, Pengdi Chen et al. · Shenzhen University · Nankai University

Removes semantic AI-image watermarks via micro-geometric perturbations that break phase alignment without semantic drift

Output Integrity Attack vision generative
PDF
defense arXiv Jan 29, 2026 · 9w ago

MPF-Net: Exposing High-Fidelity AI-Generated Video Forgeries via Hierarchical Manifold Deviation and Micro-Temporal Fluctuations

Xinan He, Kaiqing Lin, Yue Zhou et al. · NanChang University · Shenzhen University +3 more

Detects AI-generated video forgeries via hierarchical dual-path analysis of manifold deviations and structured inter-frame residual fingerprints

Output Integrity Attack vision
PDF
defense arXiv Dec 7, 2025 · Dec 2025

AlignGemini: Generalizable AI-Generated Image Detection Through Task-Model Alignment

Ruoxin Chen, Jiahui Gao, Kaiqing Lin et al. · Tencent · East China University of Science and Technology +2 more

Proposes task-model alignment combining VLMs and vision models for generalizable AI-generated image detection

Output Integrity Attack vision multimodal
PDF
defense arXiv Nov 24, 2025 · Nov 2025

Towards Generalizable Deepfake Detection via Forgery-aware Audio-Visual Adaptation: A Variational Bayesian Approach

Fan Nie, Jiangqun Ni, Jian Zhang et al. · Sun Yat-Sen University · Pengcheng Laboratory +4 more

Novel variational Bayesian framework detects audio-visual deepfakes by modeling cross-modal inconsistencies as Gaussian latent variables

Output Integrity Attack multimodal vision audio generative
1 citation PDF
defense arXiv Nov 13, 2025 · Nov 2025

Fairness-Aware Deepfake Detection: Leveraging Dual-Mechanism Optimization

Feng Ding, Wenhui Yi, Yunpeng Zhou et al. · NanChang University · Shenzhen University +1 more

Fairness-aware deepfake detector using channel decoupling and distribution alignment to reduce demographic bias without sacrificing accuracy

Output Integrity Attack vision
PDF
attack arXiv Nov 11, 2025 · Nov 2025

Why does weak-OOD help? A Further Step Towards Understanding Jailbreaking VLMs

Yuxuan Zhou, Yuzhao Peng, Yang Bai et al. · Tsinghua University · ByteDance +4 more

Analyzes why mild OOD image manipulation best jailbreaks VLMs, then proposes JOCR, an OCR-based visual attack outperforming SOTA baselines

Input Manipulation Attack Prompt Injection vision multimodal nlp
PDF
attack arXiv Nov 10, 2025 · Nov 2025

JPRO: Automated Multimodal Jailbreaking via Multi-Agent Collaboration Framework

Yuxuan Zhou, Yang Bai, Kuofeng Gao et al. · Tsinghua University · ByteDance +1 more

Multi-agent framework automates black-box jailbreaking of VLMs via coordinated image-text pair generation, achieving 60%+ ASR on GPT-4o

Prompt Injection multimodal nlp
PDF
defense arXiv Nov 10, 2025 · Nov 2025

Improving Deepfake Detection with Reinforcement Learning-Based Adaptive Data Augmentation

Yuxuan Zhou, Tao Yu, Wen Huang et al. · Tsinghua University · CASIA +1 more

Trains deepfake detectors with RL-adaptive curriculum augmentation and causal inference to generalize across unseen forgery domains

Output Integrity Attack vision
PDF
attack arXiv Nov 1, 2025 · Nov 2025

Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling

Zenghao Niu, Weicheng Xie, Siyang Song et al. · Shenzhen University · Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ) +3 more

Gradient-guided sampling attack improves adversarial transferability across DNNs and VLMs by balancing loss flatness and attack potency

Input Manipulation Attack Prompt Injection vision multimodal
PDF Code
benchmark arXiv Oct 11, 2025 · Oct 2025

Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images

Chuangchuang Tan, Xiang Ming, Jinglu Wang et al. · Beijing Jiaotong University · Microsoft Research Asia +1 more

Benchmark and evaluation framework for detecting semantic anomalies in AI-generated images, targeting deepfake detection and AIGC authenticity

Output Integrity Attack vision multimodal
PDF
defense arXiv Oct 5, 2025 · Oct 2025

COSMO-RL: Towards Trustworthy LMRMs via Joint Safety and Stability

Yizhuo Ding, Mingkang Chen, Qiuhua Liu et al. · Fudan University · Shanghai AI Laboratory +3 more

Defends large multimodal reasoning models against jailbreaks via multi-objective RL that jointly optimizes safety and reasoning capability

Prompt Injection multimodal nlp vision reinforcement-learning
PDF