Hongwei Yao

Papers in Database (2)

defense · arXiv · Aug 8, 2025

Quantifying Conversation Drift in MCP via Latent Polytope

Haoran Shi, Hongwei Yao, Shuo Shao et al. · arXiv · Zhejiang University +3 more

Defends LLM-MCP tool integrations against indirect prompt injection by detecting adversarial conversation drift in latent polytope space

Insecure Plugin Design · Prompt Injection · nlp
PDF

defense · arXiv · Sep 3, 2025

PromptCOS: Towards Content-only System Prompt Copyright Auditing for LLMs

Yuchen Yang, Yiming Li, Hongwei Yao et al. · Zhejiang University · Nanyang Technological University +2 more

Watermarks LLM system prompts with content-only verification to detect prompt theft without requiring access to model logits

Model Theft · Sensitive Information Disclosure · nlp
PDF · Code