Shuai Wang

Papers in Database (4)

defense · arXiv · Feb 9, 2026

On Protecting Agentic Systems' Intellectual Property via Watermarking

Liwen Wang, Zongjie Li, Yuchong Xie et al. · The Hong Kong University of Science and Technology · HSBC

Watermarks agentic LLM systems by biasing tool execution paths, so stolen imitation models inherit detectable signatures

Model Theft · nlp

attack · arXiv · Aug 27, 2025

Disabling Self-Correction in Retrieval-Augmented Generation via Stealthy Retriever Poisoning

Yanbo Dai, Zhenlan Ji, Zongjie Li et al. · The Hong Kong University of Science and Technology

Backdoors RAG retrievers via model editing to inject anti-self-correction instructions, achieving >90% attack success across 6 LLMs

Model Poisoning · Prompt Injection · nlp

defense · arXiv · Mar 26, 2026

Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models

Xunguang Wang, Yuguang Zhou, Qingyue Wang et al. · The Hong Kong University of Science and Technology · Zhejiang University of Technology

Real-time monitor that detects adversarial manipulation of LLM chain-of-thought reasoning via step-level analysis and error classification

Prompt Injection · Model Denial of Service · nlp

attack · arXiv · Sep 6, 2025

Red-Teaming Coding Agents from a Tool-Invocation Perspective: An Empirical Security Assessment

Yuchong Xie, Mingyu Luo, Zesen Liu et al. · The Hong Kong University of Science and Technology · Fudan University

Red-teams six coding agents via tool-invocation prompt injection and ToolLeak, achieving remote code execution (RCE) and system-prompt exfiltration across all tested agents

Prompt Injection · Sensitive Information Disclosure · Insecure Plugin Design · nlp