Yuansen Zhang

h-index: 1 · 2 citations · 3 papers (total)

Papers in Database (2)

benchmark · arXiv · Nov 8, 2025

Can LLM Infer Risk Information From MCP Server System Logs?

Jiayi Fu, Yuansen Zhang, Yinggui Wang · Southern University of Science and Technology · Ant Group

Benchmark dataset and fine-tuning approach for training LLMs to detect malicious MCP server risks from system logs

Insecure Plugin Design · nlp
defense · arXiv · Dec 18, 2025

Prefix Probing: Lightweight Harmful Content Detection for Large Language Models

Jirui Yang, Hengqi Guo, Zhihui Lu et al. · Fudan University · Ant Group +1 more

Defends LLMs against harmful prompts by comparing the log-probabilities of refusal and agreement prefixes, with near-zero inference overhead

Prompt Injection · nlp