Zhaoyi Zhang

Papers in Database (1)

defense · arXiv · Sep 18, 2025

Adversarial Distilled Retrieval-Augmented Guarding Model for Online Malicious Intent Detection

Yihao Guo, Haocheng Bian, Liutong Zhou et al. · Apple · Cohere +3 more

Builds a compact 149M-parameter RAG-augmented guard model that detects malicious LLM prompts in real time with GPT-4-level accuracy

Prompt Injection · nlp