Latest papers

2 papers
defense · arXiv · Dec 22, 2025

HATS: High-Accuracy Triple-Set Watermarking for Large Language Models

Zhiqing Hu, Chenxu Zhao, Jiazhong Lu et al. · China Academy of Engineering Physics · National Interdisciplinary Research Center of Engineering Physics +1 more

Triple-set vocabulary watermark for LLM text achieves higher detection accuracy than binary KGW while preserving readability

Output Integrity Attack · nlp
PDF
attack · arXiv · Nov 11, 2025

LoopLLM: Transferable Energy-Latency Attacks in LLMs via Repetitive Generation

Xingyu Li, Xiaolei Liu, Cheng Liu et al. · National Interdisciplinary Research Center of Engineering Physics · Institute of Computer Application +2 more

Gradient-based adversarial prompt attack forces LLMs into repetitive loops that run to the maximum output length, exhausting compute resources

Model Denial of Service · nlp
4 citations · 2 influential · PDF · Code