Kai Chen

defense · arXiv · Mar 19, 2026

CNT: Safety-oriented Function Reuse across LLMs via Cross-Model Neuron Transfer

Yue Zhao, Yujia Gong, Ruigang Liang et al. · Chinese Academy of Sciences · Beijing University of Posts and Telecommunications +1 more

Transfers safety functionality between LLMs by transplanting minimal neuron subsets, enabling alignment enhancement and jailbreak defense without retraining

Prompt Injection · nlp


attack · arXiv · Aug 27, 2025

The Art of Hide and Seek: Making Pickle-Based Model Supply Chain Poisoning Stealthy Again

Tong Liu, Guozhu Meng, Peng Zhou et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +2 more

Reveals 22 pickle-based model-loading attack paths and 133 gadgets that bypass all state-of-the-art supply chain scanners on Hugging Face

AI Supply Chain Attacks


defense · arXiv · Jan 9, 2025

RAG-WM: An Efficient Black-Box Watermarking Approach for Retrieval-Augmented Generation of Large Language Models

Peizhuo Lv, Mengjie Sun, Hao Wang et al. · Chinese Academy of Sciences · Shandong University +2 more

Embeds 'knowledge watermarks' into RAG document stores to detect IP theft of retrieval-augmented LLM systems via black-box querying

Model Theft · nlp

