Yelong Shen

attack arXiv Oct 24, 2025 · Oct 2025

Kieu Dang, Phung Lai, NhatHai Phan et al. · University at Albany · New Jersey Institute of Technology +2 more

LDP noise injection during fine-tuning steals LLM behavior from APIs while evading watermark detectors, achieving 96.95% attack success rate

Model Theft Output Integrity Attack Model Theft nlp

2 citations PDF Code

Papers in Database (1)