Yifan Ding

benchmark arXiv Jan 15, 2026 · 11w ago

Xingjun Ma, Yixu Wang, Hengyuan Xu et al. · Fudan University · Shanghai Innovation Institute +2 more

Benchmarks six frontier LLMs/VLMs on adversarial, multilingual, and compliance safety, revealing all collapse below 6% worst-case safety rates

Prompt Injection nlpmultimodalvisiongenerative

1 citations PDF

benchmark arXiv Jan 8, 2026 · 12w ago

Yunhao Feng, Yige Li, Yutao Wu et al. · Fudan University · Alibaba Group +4 more

Benchmark framework systematizing backdoor attacks across planning, memory, and tool-use stages of LLM agent workflows

Model Poisoning Excessive Agency nlpmultimodal

1 citations PDF Code

attack arXiv Oct 11, 2025 · Oct 2025

Yutao Wu, Xiao Liu, Yinghui Li et al. · Deakin University · Fudan University +1 more

Poisons RAG knowledge bases with few adversarial documents to flip LLM fact-checking decisions at 86% ASR, black-box and transfer-robust.

Data Poisoning Attack Prompt Injection nlp

Papers in Database (3)