Ruixuan Huang

Papers in Database (1)

defense · arXiv · Mar 26, 2026

Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models

Xunguang Wang, Yuguang Zhou, Qingyue Wang et al. · The Hong Kong University of Science and Technology · Zhejiang University of Technology

Real-time monitor that detects adversarial manipulation of LLM chain-of-thought reasoning via step-level analysis and error classification

Prompt Injection · Model Denial of Service · nlp