Jin Song Dong

Papers in Database (5)

attack · arXiv · Aug 14, 2025

Failures to Surface Harmful Contents in Video Large Language Models

Yuxin Cao, Wei Song, Derui Wang et al. · National University of Singapore · University of New South Wales +1 more

Three black-box attacks exploit VideoLLM architectural blind spots to hide harmful video content from generated summaries with >90% success rate

Input Manipulation Attack · Prompt Injection · multimodal · vision · nlp
PDF Code

attack · arXiv · Feb 17, 2026

Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections

Xianglin Yang, Yufei He, Shuo Ji et al. · National University of Singapore

Persistent cross-session attack poisons LLM agent memory via indirect web injection, causing unauthorized tool actions across future sessions

Prompt Injection · Excessive Agency · nlp
PDF

defense · arXiv · Aug 5, 2025

Seeing It Before It Happens: In-Generation NSFW Detection for Diffusion-Based Text-to-Image Models

Fan Yang, Yihao Huang, Jiayi Zhu et al. · Huazhong University of Science and Technology · National University of Singapore +2 more

Defends diffusion T2I models against NSFW generation by classifying predicted noise mid-generation, robust to adversarial prompts

Output Integrity Attack · vision · generative
PDF

attack · arXiv · Aug 14, 2025

Towards Powerful and Practical Patch Attacks for 2D Object Detection in Autonomous Driving

Yuxin Cao, Yedi Zhang, Wentao He et al. · Changan Automobile · National University of Singapore +2 more

Adversarial patch attack for autonomous driving pedestrian detection with novel IoU-based loss and high-resolution transferability technique

Input Manipulation Attack · vision
PDF

attack · arXiv · Aug 4, 2025

Towards Stealthy and Effective Backdoor Attacks on Lane Detection: A Naturalistic Data Poisoning Approach

Yifan Liao, Yuxin Cao, Yedi Zhang et al. · Changan Automobile · National University of Singapore +2 more

Diffusion-based backdoor attack on lane detection models using naturalistic triggers with gradient-guided optimal placement

Model Poisoning · Data Poisoning Attack · vision
PDF