Variation is the Key: A Variation-Based Framework for LLM-Generated Text Detection
Xuecong Li , Xiaohong Li , Qiang Hu , Yao Zhang , Junjie Wang
Published on arXiv
2602.13226
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
VaryBalance outperforms the state-of-the-art Binoculars by up to 34.3% AUROC on formal writing, evaluated across eight datasets and five LLMs
VaryBalance
Novel technique introduced
Detecting text generated by large language models (LLMs) is crucial but challenging. Existing detectors rely on impractical assumptions, such as white-box access, or depend solely on text-level features, yielding imprecise detection. In this paper, we propose VaryBalance, a simple yet effective and practical method for detecting LLM-generated text. The core observation behind VaryBalance is that human-written texts differ more from their LLM-rewritten versions than LLM-generated texts do. Leveraging this observation, VaryBalance quantifies the difference via the mean standard deviation of log perplexities and uses it to distinguish human texts from LLM-generated ones. Comprehensive experiments demonstrate that VaryBalance outperforms the state-of-the-art detector, Binoculars, by up to 34.3% in terms of AUROC, and remains robust across multiple generating models and languages.
Key Contributions
- Empirical observation that human texts exhibit greater log-perplexity variation across LLM rewrites than LLM-generated texts do
- VaryBalance: a black-box detector that uses mean standard deviation of rewritten-text log perplexities to distinguish human vs. LLM-generated text
- Extended scoring variant for short or stylistically diverse social media text; outperforms Binoculars by up to 34.3% AUROC across eight datasets and five models
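The detection logic described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes you have already produced several LLM rewrites of a candidate text and measured the log perplexity of each (e.g., with any scoring LLM); the function names and the threshold value are hypothetical.

```python
import statistics

def varybalance_score(rewrite_log_ppls: list[float]) -> float:
    """Illustrative VaryBalance-style score: the standard deviation of
    log perplexities measured over several LLM rewrites of one text.
    Per the paper's observation, human text varies more under rewriting."""
    return statistics.pstdev(rewrite_log_ppls)

def classify(rewrite_log_ppls: list[float], threshold: float = 0.5) -> str:
    # Texts whose rewrites vary more than the (hypothetical) threshold
    # are flagged as human-written; low-variation texts as LLM-generated.
    return "human" if varybalance_score(rewrite_log_ppls) > threshold else "llm"

# Toy usage with made-up log-perplexity measurements:
print(classify([2.1, 3.4, 1.8]))  # high variation across rewrites
print(classify([2.0, 2.1, 1.9]))  # low variation across rewrites
```

In practice the threshold would be calibrated on held-out data, and the paper reports AUROC over the raw score rather than a fixed cutoff.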
🛡️ Threat Analysis
VaryBalance is an AI-generated content detector that distinguishes human-written from LLM-generated text — directly addressing output integrity and content provenance, the core concern of ML09.