Latest papers

2 papers
defense arXiv Feb 5, 2026

Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening

Zhenxiong Yu, Zhi Yang, Zhiheng Jin et al. · SUFE · NUS +5 more

Event-driven LLM agent defense that selectively triggers hierarchical screening against prompt injection and multi-stage agent attacks

Prompt Injection Excessive Agency nlp
benchmark arXiv Dec 15, 2025

Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans?

Jiaqi Wang, Weijia Wu, Yi Zhan et al. · CUHK · NUS +2 more

Benchmark revealing VLMs barely exceed chance at detecting AI-generated ASMR videos, far below human expert accuracy

Output Integrity Attack vision audio multimodal
1 citation