Meng Han

defense arXiv Sep 3, 2025 · Sep 2025

EverTracer: Hunting Stolen Large Language Models via Stealthy and Robust Probabilistic Fingerprint

Zhenhua Xu, Meng Han, Wenpeng Xing · Zhejiang University · GenTel.io

Detects stolen LLMs via memorization-based probabilistic fingerprints that remain stealthy and robust under gray-box API access

Model Theft · Model Theft · nlp

PDF Code

attack arXiv Sep 4, 2025 · Sep 2025

MEUV: Achieving Fine-Grained Capability Activation in Large Language Models via Mutually Exclusive Unlock Vectors

Xin Tong, Zhi Lin, Jingya Wang et al. · People’s Public Security University of China · Tsinghua University +2 more

Factorizes LLM refusal directions into topic-specific vectors to achieve fine-grained, semantically controlled safety alignment bypass

Prompt Injection · nlp

PDF

defense arXiv Aug 31, 2025 · Aug 2025

PREE: Towards Harmless and Adaptive Fingerprint Editing in Large Language Models via Knowledge Prefix Enhancement

Xubin Yue, Zhenhua Xu, Wenpeng Xing et al. · Zhejiang University · GenTel.io +1 more

Embeds ownership fingerprints in LLM parameter offsets via dual-channel knowledge editing, resisting fine-tuning erasure and feature-space defenses

Model Theft · Model Theft · nlp

PDF

defense arXiv Sep 5, 2025 · Sep 2025

CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor

Zhenhua Xu, Xixiang Zhao, Xubin Yue et al. · Zhejiang University · The Hong Kong Polytechnic University +1 more

Embeds verifiable LLM ownership fingerprints via multi-turn contextual backdoors resistant to perplexity detection and adversarial fine-tuning

Model Theft · Model Theft · nlp

PDF Code

defense arXiv Aug 31, 2025 · Aug 2025

Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models

Zhenhua Xu, Zhaokun Yan, Binhan Xu et al. · Zhejiang University · China Academy of Information and Communications Technology +3 more

Embeds backdoor ownership fingerprints into LoRA adapters for lightweight, transferable LLM IP protection across downstream models

Model Theft · Model Theft · nlp

PDF Code

attack arXiv Sep 1, 2025 · Sep 2025

Web Fraud Attacks Against LLM-Driven Multi-Agent Systems

Dezhang Kong, Hujin Peng, Yilun Zhang et al. · Zhejiang University · Changsha University of Science and Technology +4 more

Attacks LLM multi-agent systems via manipulated web links using homoglyph, subdirectory, and obfuscation techniques

Insecure Plugin Design · Excessive Agency · nlp

PDF Code

defense arXiv Aug 14, 2025 · Aug 2025

MCP-Guard: A Multi-Stage Defense-in-Depth Framework for Securing Model Context Protocol in Agentic AI

Wenpeng Xing, Zhonghao Qi, Yupeng Qin et al. · Zhejiang University · Binjiang Institute of Zhejiang University +3 more

Defends LLM-tool MCP interfaces from prompt injection and data exfiltration via a three-stage neural detection pipeline

Insecure Plugin Design · Prompt Injection · nlp

PDF

survey arXiv Aug 15, 2025 · Aug 2025

Copyright Protection for Large Language Models: A Survey of Methods, Challenges, and Trends

Zhenhua Xu, Xubin Yue, Zhebo Wang et al. · Zhejiang University · GenTel.io

Surveys LLM copyright protection: text watermarking, model fingerprinting, fingerprint transfer/removal, and IP ownership verification

Model Theft · Output Integrity Attack · Model Theft · nlp

PDF Code

Papers in Database (8)