Latest papers

7 papers
attack · arXiv · Mar 10, 2026 · 29d ago

Reasoning-Oriented Programming: Chaining Semantic Gadgets to Jailbreak Large Vision Language Models

Quanchen Zou, Moyang Chen, Zonghao Ying et al. · 360 AI Security Lab · Wenzhou-Kean University +1 more

Jailbreaks VLMs by chaining semantically benign visual gadgets via prompt-controlled reasoning to synthesize harmful outputs, bypassing perception-level alignment

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal
PDF
attack · arXiv · Mar 7, 2026 · 4w ago

Two Frames Matter: A Temporal Attack for Text-to-Video Model Jailbreaking

Moyang Chen, Zonghao Ying, Wenzhuo Xu et al. · Wenzhou-Kean University · 360 AI Security Lab +1 more

Jailbreaks text-to-video models by exploiting temporal infilling: prompts that specify only sparse boundary frames induce the model to generate harmful intermediate content

Prompt Injection · multimodal · generative
PDF
defense · arXiv · Jan 24, 2026 · 10w ago

Robust Privacy: Inference-Time Privacy through Certified Robustness

Jiankai Jin, Xiangzheng Zhang, Zhao Liu et al. · 360 AI Security Lab

Repurposes certified robustness as an inference-time privacy defense, reducing the success rate of model inversion attacks from 73% to 4%

Model Inversion Attack · vision · tabular
PDF
attack · arXiv · Nov 17, 2025 · Nov 2025

VEIL: Jailbreaking Text-to-Video Models via Visual Exploitation from Implicit Language

Zonghao Ying, Moyang Chen, Nizhang Li et al. · Beihang University · Wenzhou-Kean University +4 more

Jailbreaks text-to-video models using benign prompts with auditory triggers and cinematic cues that exploit cross-modal priors

Prompt Injection · multimodal · generative · vision · nlp
1 citation · PDF · Code
attack · arXiv · Oct 16, 2025 · Oct 2025

Sequential Comics for Jailbreaking Multimodal Large Language Models via Structured Visual Storytelling

Deyue Zhang, Dongdong Yang, Junjie Mu et al. · 360 AI Security Lab · Politecnico di Milano +1 more

Jailbreaks multimodal LLMs with diffusion-generated comic sequences that exploit narrative coherence to bypass safety alignment

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal · generative
1 citation · PDF
benchmark · arXiv · Oct 11, 2025 · Oct 2025

SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

Zonghao Ying, Yangguang Shao, Jianle Gan et al. · Beihang University · Chinese Academy of Sciences +7 more

Benchmark evaluating LVLM web agent security across six attack vectors in realistic web environments, exposing vulnerabilities shared by all nine evaluated models

Prompt Injection · Excessive Agency · multimodal · nlp
5 citations · PDF
attack · arXiv · Sep 8, 2025 · Sep 2025

Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?

Junjie Mu, Zonghao Ying, Zhekui Fan et al. · Beihang University · 360 AI Security Lab +4 more

Identifies redundant tokens in GCG adversarial suffixes via learnable masking, reducing LLM jailbreak attack time by 16.8%

Input Manipulation Attack · Prompt Injection · nlp
PDF