Dara Bahri

defense arXiv Aug 18, 2025

Dara Bahri, John Wieting · Google DeepMind

Hybrid LLM text detection combining watermark and classifier signals boosts accuracy from 75% to 95% on low-entropy prompts

Output Integrity Attack nlp

Papers in Database (1)