Yang Bai

h-index: 3 12 citations 6 papers (total)

Papers in Database (2)

attack arXiv Nov 10, 2025 · Nov 2025

JPRO: Automated Multimodal Jailbreaking via Multi-Agent Collaboration Framework

Yuxuan Zhou, Yang Bai, Kuofeng Gao et al. · Tsinghua University · ByteDance +1 more

Multi-agent framework automates black-box jailbreaking of VLMs via coordinated image-text pair generation, achieving 60%+ ASR on GPT-4o

Prompt Injection multimodalnlp
PDF
attack arXiv Nov 11, 2025 · Nov 2025

Why does weak-OOD help? A Further Step Towards Understanding Jailbreaking VLMs

Yuxuan Zhou, Yuzhao Peng, Yang Bai et al. · Tsinghua University · ByteDance +4 more

Analyzes why mild OOD image manipulation best jailbreaks VLMs, then proposes JOCR, an OCR-based visual attack outperforming SOTA baselines

Input Manipulation Attack Prompt Injection visionmultimodalnlp
PDF