Xuan Gong

Papers in Database (1)

attack arXiv Apr 13, 2026

RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience

Hanbo Huang, Xuan Gong, Yiran Zhang et al. · Shanghai Jiao Tong University

RL-based black-box attack that spoofs LLM watermarks with a 62% success rate, using only 100 training pairs and no detector access

Output Integrity Attack nlp