Yiming Li

Papers in Database (7)

defense arXiv Sep 3, 2025

PromptCOS: Towards Content-only System Prompt Copyright Auditing for LLMs

Yuchen Yang, Yiming Li, Hongwei Yao et al. · Zhejiang University · Nanyang Technological University +2 more

Watermarks LLM system prompts with content-only verification to detect prompt theft without requiring access to model logits

Model Theft Sensitive Information Disclosure nlp
PDF Code
benchmark arXiv Aug 27, 2025

SoK: Large Language Model Copyright Auditing via Fingerprinting

Shuo Shao, Yiming Li, Yu He et al. · Zhejiang University · Nanyang Technological University +3 more

Surveys LLM fingerprinting for copyright auditing and benchmarks 13 post-development robustness techniques across 149 model instances

Model Theft Model Theft nlp
PDF Code
defense arXiv Oct 8, 2025

Reading Between the Lines: Towards Reliable Black-box LLM Fingerprinting via Zeroth-order Gradient Estimation

Shuo Shao, Yiming Li, Hongwei Yao et al. · Zhejiang University · Nanyang Technological University +1 more

Fingerprints LLMs in black-box settings via zeroth-order Jacobian estimation to detect stolen or illicitly copied models

Model Theft Model Theft nlp
PDF Code
defense arXiv Mar 11, 2026

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

Yu He, Haozhe Zhu, Yiming Li et al. · Zhejiang University · Nanyang Technological University +1 more

Runtime defense for LLM agents detecting indirect prompt injection via causal counterfactual analysis of tool invocations

Prompt Injection nlp
PDF Code
defense arXiv Aug 4, 2025

Coward: Collision-based Watermark for Proactive Federated Backdoor Detection

Wenjie Li, Siying Gu, Yiming Li et al. · Tsinghua University · East China Normal University +1 more

Defends federated learning against backdoor attacks using multi-backdoor collision effects to create a server-injected detection watermark

Model Poisoning federated-learning vision
PDF Code
attack arXiv Aug 9, 2025

Towards Effective Prompt Stealing Attack against Text-to-Image Diffusion Models

Shiqian Zhao, Chong Wang, Yiming Li et al. · Nanyang Technological University · National University of Singapore +2 more

Reverse-engineers valuable user prompts from T2I showcase images by interacting with a local proxy diffusion model

Model Theft Sensitive Information Disclosure vision nlp generative
PDF
defense arXiv Aug 12, 2025

Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems

Yutong Wu, Jie Zhang, Yiming Li et al. · Nanyang Technological University · Technology and Research +2 more

Proposes Cowpox, a distributed cure-sample defense immunizing VLM multi-agent systems against propagating jailbreak infections

Prompt Injection Excessive Agency multimodal nlp
PDF Code