Hongwei Yao

Papers in Database (3)

defense arXiv Oct 8, 2025

Reading Between the Lines: Towards Reliable Black-box LLM Fingerprinting via Zeroth-order Gradient Estimation

Shuo Shao, Yiming Li, Hongwei Yao et al. · Zhejiang University · Nanyang Technological University +1 more

Fingerprints LLMs in black-box settings via zeroth-order Jacobian estimation to detect stolen or illicitly copied models

Model Theft nlp
benchmark arXiv Aug 27, 2025

SoK: Large Language Model Copyright Auditing via Fingerprinting

Shuo Shao, Yiming Li, Yu He et al. · Zhejiang University · Nanyang Technological University +3 more

Surveys LLM fingerprinting for copyright auditing and benchmarks 13 post-development robustness techniques across 149 model instances

Model Theft nlp
defense arXiv Mar 11, 2026

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

Yu He, Haozhe Zhu, Yiming Li et al. · Zhejiang University · Nanyang Technological University +1 more

Runtime defense for LLM agents detecting indirect prompt injection via causal counterfactual analysis of tool invocations

Prompt Injection nlp