SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis
Haitong Luo 1,2, Weiyao Zhang 1, Suhang Wang 3, Wenji Zou 1,2, Chungang Lin 1,2, Xuying Meng 1,4, Yujun Zhang 1,5,2
Published on arXiv
2508.11343
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
SpecDetect outperforms the state-of-the-art training-free detector of LLM-generated text while running in nearly half the time, using a single feature: DFT total energy.
Novel technique introduced
SpecDetect / SpecDetect++
The proliferation of high-quality text from Large Language Models (LLMs) demands reliable and efficient detection methods. While existing training-free approaches show promise, they often rely on surface-level statistics and overlook fundamental signal properties of the text generation process. In this work, we reframe detection as a signal processing problem, introducing a novel paradigm that analyzes the sequence of token log-probabilities in the frequency domain. By systematically analyzing the signal's spectral properties using the global Discrete Fourier Transform (DFT) and the local Short-Time Fourier Transform (STFT), we find that human-written text consistently exhibits significantly higher spectral energy. This higher energy reflects the larger-amplitude fluctuations inherent in human writing compared to the suppressed dynamics of LLM-generated text. Based on this key insight, we construct SpecDetect, a detector built on a single, robust feature from the global DFT: DFT total energy. We also propose an enhanced version, SpecDetect++, which incorporates a sampling discrepancy mechanism to further boost robustness. Extensive experiments show that our approach outperforms the state-of-the-art model while running in nearly half the time. Our work introduces a new, efficient, and interpretable pathway for LLM-generated text detection, showing that classical signal processing techniques offer a surprisingly powerful solution to this modern challenge.
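The core feature described above can be illustrated with a minimal sketch: treat the token log-probabilities as a discrete-time signal, take its DFT, and sum the squared magnitudes. This is not the authors' implementation. The function names, the optional mean-centering step, and the toy sequences are assumptions made for illustration, and a naive DFT is used instead of an FFT to keep the example dependency-free.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def dft_total_energy(logprobs, center=True):
    """Sum of squared DFT magnitudes of a token log-probability sequence.

    `center` (mean removal) is an illustrative assumption, not taken from
    the source; it keeps the DC component from dominating the total.
    """
    if not logprobs:
        raise ValueError("need at least one log-probability")
    if center:
        mean = sum(logprobs) / len(logprobs)
        logprobs = [x - mean for x in logprobs]
    return sum(abs(c) ** 2 for c in dft(logprobs))


# Made-up toy sequences: one with large swings, one with small swings.
fluctuating = [-0.2, -5.1, -0.4, -6.3, -0.1, -4.8, -0.9, -7.0]
flat = [-0.3, -0.6, -0.4, -0.5, -0.3, -0.7, -0.4, -0.5]

print(dft_total_energy(fluctuating) > dft_total_energy(flat))
```

Under this sketch, a sequence with larger-amplitude fluctuations yields a higher energy, which matches the direction of the reported finding. How the score is thresholded, and which model supplies the log-probabilities, are left open here.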
Key Contributions
- Novel frequency-domain paradigm for LLM text detection: treats token log-probability sequences as discrete-time signals and applies DFT/STFT to extract spectral features
- SpecDetect: a training-free detector based on a single hyperparameter-free feature (DFT total energy) that captures suppressed spectral dynamics in LLM-generated text
- SpecDetect++: an enhanced variant that incorporates a sampling discrepancy mechanism to further boost robustness; the approach is reported to outperform the state-of-the-art method while running in nearly half the time
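The local (STFT) view mentioned in the first contribution can be sketched in the same spirit: slide a window over the log-probability signal and compute the spectral energy of each frame. This is only an illustration. The function name, the Hann window, and the `frame_len` and `hop` defaults are hypothetical choices, not values taken from the paper.

```python
import cmath
import math


def stft_frame_energies(signal, frame_len=8, hop=4):
    """Spectral energy of each Hann-windowed frame (a basic STFT energy profile).

    `frame_len` and `hop` are illustrative assumptions, not the paper's settings.
    """
    if frame_len < 2 or hop < 1:
        raise ValueError("frame_len must be >= 2 and hop must be >= 1")
    # Symmetric Hann window to taper the edges of each frame.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
        for i in range(frame_len)
    ]
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        # Naive DFT of the windowed frame.
        spectrum = [
            sum(
                x * cmath.exp(-2j * math.pi * k * t / frame_len)
                for t, x in enumerate(frame)
            )
            for k in range(frame_len)
        ]
        energies.append(sum(abs(c) ** 2 for c in spectrum))
    return energies


# Made-up toy log-probability sequence of 16 tokens.
toy = [-0.2, -5.1, -0.4, -6.3, -0.1, -4.8, -0.9, -7.0,
       -0.3, -0.6, -0.4, -0.5, -0.3, -0.7, -0.4, -0.5]
print(stft_frame_energies(toy))
```

Each returned value corresponds to one frame, so the list shows where in the text the fluctuations are concentrated, whereas the global DFT collapses the whole sequence into a single number.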
🛡️ Threat Analysis
Proposes a novel AI-generated text detection method (SpecDetect/SpecDetect++) that analyzes output token log-probability sequences in the frequency domain. It directly addresses output integrity and content provenance by distinguishing LLM-generated text from human-written text.