Jin Song Dong

Papers in Database (6)

attack arXiv Feb 17, 2026

Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections

Xianglin Yang, Yufei He, Shuo Ji et al. · National University of Singapore

Persistent cross-session attack poisons LLM agent memory via indirect web injection, causing unauthorized tool actions across future sessions

Prompt Injection Excessive Agency nlp
attack arXiv Aug 14, 2025

Failures to Surface Harmful Contents in Video Large Language Models

Yuxin Cao, Wei Song, Derui Wang et al. · National University of Singapore · University of New South Wales +1 more

Three black-box attacks exploit VideoLLM architectural blind spots to hide harmful video content from generated summaries with >90% success rate

Input Manipulation Attack Prompt Injection multimodal vision nlp
defense arXiv Aug 5, 2025

Seeing It Before It Happens: In-Generation NSFW Detection for Diffusion-Based Text-to-Image Models

Fan Yang, Yihao Huang, Jiayi Zhu et al. · Huazhong University of Science and Technology · National University of Singapore +2 more

Defends diffusion T2I models against NSFW generation by classifying predicted noise mid-generation, robust to adversarial prompts

Output Integrity Attack vision generative
attack arXiv Aug 14, 2025

Towards Powerful and Practical Patch Attacks for 2D Object Detection in Autonomous Driving

Yuxin Cao, Yedi Zhang, Wentao He et al. · Changan Automobile · National University of Singapore +2 more

Adversarial patch attack for autonomous driving pedestrian detection with novel IoU-based loss and high-resolution transferability technique

Input Manipulation Attack vision
attack arXiv Aug 4, 2025

Towards Stealthy and Effective Backdoor Attacks on Lane Detection: A Naturalistic Data Poisoning Approach

Yifan Liao, Yuxin Cao, Yedi Zhang et al. · Changan Automobile · National University of Singapore +2 more

Diffusion-based backdoor attack on lane detection models using naturalistic triggers with gradient-guided optimal placement

Model Poisoning Data Poisoning Attack vision
defense arXiv Apr 24, 2026

Train in Vain: Functionality-Preserving Poisoning to Prevent Unauthorized Use of Code Datasets

Yuan Xiao, Jiaming Wang, Yuchen Chen et al. · Nanjing University · University of New South Wales +3 more

Dataset poisoning defense that injects compilable, functionality-preserving code fragments to degrade CodeLLM training with only 10% contamination

Data Poisoning Attack Training Data Poisoning nlp