Shu-Tao Xia

attack arXiv Nov 10, 2025 · Nov 2025

Yuxuan Zhou, Yang Bai, Kuofeng Gao et al. · Tsinghua University · ByteDance +1 more

Multi-agent framework automates black-box jailbreaking of VLMs via coordinated image-text pair generation, achieving an attack success rate (ASR) above 60% on GPT-4o

Prompt Injection multimodal nlp

defense arXiv Nov 10, 2025 · Nov 2025

Yuxuan Zhou, Tao Yu, Wen Huang et al. · Tsinghua University · CASIA +1 more

Trains deepfake detectors with RL-adaptive curriculum augmentation and causal inference to generalize across unseen forgery domains

Output Integrity Attack vision

attack arXiv Nov 11, 2025 · Nov 2025

Yuxuan Zhou, Yuzhao Peng, Yang Bai et al. · Tsinghua University · ByteDance +4 more

Analyzes why mild out-of-distribution (OOD) image manipulation best jailbreaks VLMs, then proposes JOCR, an OCR-based visual attack that outperforms state-of-the-art baselines

Input Manipulation Attack Prompt Injection vision multimodal nlp

Papers in Database (3)