Weizheng Gu

Papers in Database (1)

defense arXiv Aug 11, 2025 · Aug 2025

SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders

Zhuohao Yu, Xingru Jiang, Weizheng Gu et al. · Peking University

Black-box text watermarking for LLMs using rejection sampling guided by sparse autoencoder features, enabling multilingual content attribution without logit access

Output Integrity Attack · nlp