tool 2025

SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis

Haitong Luo 1,2, Weiyao Zhang 1, Suhang Wang 3, Wenji Zou 1,2, Chungang Lin 1,2, Xuying Meng 1,4, Yujun Zhang 1,5,2

0 citations


Published on arXiv

2508.11343

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

SpecDetect outperforms the state-of-the-art training-free LLM text detector while running in nearly half the time, using a single DFT total energy feature

SpecDetect / SpecDetect++

Novel technique introduced


The proliferation of high-quality text from Large Language Models (LLMs) demands reliable and efficient detection methods. While existing training-free approaches show promise, they often rely on surface-level statistics and overlook fundamental signal properties of the text generation process. In this work, we reframe detection as a signal processing problem, introducing a novel paradigm that analyzes the sequence of token log-probabilities in the frequency domain. By systematically analyzing the signal's spectral properties using the global Discrete Fourier Transform (DFT) and the local Short-Time Fourier Transform (STFT), we find that human-written text consistently exhibits significantly higher spectral energy. This higher energy reflects the larger-amplitude fluctuations inherent in human writing compared to the suppressed dynamics of LLM-generated text. Based on this key insight, we construct SpecDetect, a detector built on a single, robust feature from the global DFT: DFT total energy. We also propose an enhanced version, SpecDetect++, which incorporates a sampling discrepancy mechanism to further boost robustness. Extensive experiments show that our approach outperforms the state-of-the-art model while running in nearly half the time. Our work introduces a new, efficient, and interpretable pathway for LLM-generated text detection, showing that classical signal processing techniques offer a surprisingly powerful solution to this modern challenge.
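The abstract's central feature, DFT total energy, can be illustrated with a short sketch. This is not the authors' implementation: the function name `dft_total_energy`, the mean removal, and the length normalization are assumptions made here for illustration, and the two log-probability sequences are synthetic toy values rather than outputs of a real scoring model.

```python
import numpy as np


def dft_total_energy(log_probs, remove_mean=True):
    """Total spectral energy of a token log-probability sequence.

    Assumptions (not taken from the paper): the mean is removed so the
    DC component does not dominate, and the energy is divided by N^2 so
    that sequences of different lengths are comparable.
    """
    x = np.asarray(log_probs, dtype=float)
    if x.size == 0:
        raise ValueError("log_probs must contain at least one value")
    if remove_mean:
        x = x - x.mean()
    spectrum = np.fft.fft(x)  # global DFT of the whole sequence
    return float(np.sum(np.abs(spectrum) ** 2) / x.size ** 2)


# Synthetic toy sequences, made up for illustration only.
human_like = [-0.5, -4.0, -0.2, -6.0, -1.0, -3.5]  # large fluctuations
llm_like = [-0.5, -0.7, -0.4, -0.6, -0.5, -0.6]    # suppressed fluctuations

print(dft_total_energy(human_like) > dft_total_energy(llm_like))
```

With these normalization choices, Parseval's theorem makes the score equal to the variance of the sequence, which matches the abstract's reading of higher energy as larger-amplitude fluctuations. A detector built this way would threshold the score, with higher values pointing toward human-written text; the threshold itself would have to be chosen on held-out data.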


Key Contributions

  • Novel frequency-domain paradigm for LLM text detection: treats token log-probability sequences as discrete-time signals and applies DFT/STFT to extract spectral features
  • SpecDetect: a training-free detector based on a single hyperparameter-free feature (DFT total energy) that captures suppressed spectral dynamics in LLM-generated text
  • SpecDetect++: enhanced variant incorporating a sampling discrepancy mechanism, improving robustness while running at nearly half the runtime of state-of-the-art methods
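The local STFT analysis named in the first contribution can be sketched as a per-window energy profile. This is a minimal sketch under stated assumptions, not the paper's procedure: the function name `stft_frame_energies`, the window length of 8, the hop of 4, and the Hann taper are all hypothetical choices made here.

```python
import numpy as np


def stft_frame_energies(log_probs, window=8, hop=4):
    """Spectral energy of each short-time frame of a log-probability signal.

    The window size, hop, and Hann taper are illustrative assumptions.
    """
    x = np.asarray(log_probs, dtype=float)
    if x.size < window:
        raise ValueError("sequence is shorter than the analysis window")
    x = x - x.mean()                      # remove the global DC offset
    taper = np.hanning(window)            # reduce spectral leakage per frame
    starts = range(0, x.size - window + 1, hop)
    frames = np.stack([x[s:s + window] * taper for s in starts])
    spectra = np.fft.fft(frames, axis=1)  # one DFT per frame
    return np.sum(np.abs(spectra) ** 2, axis=1) / window ** 2


# Synthetic signal: flat except for one surprising token at position 10.
signal = np.zeros(16)
signal[10] = 5.0
energies = stft_frame_energies(signal)
print(energies.shape)
```

The frame covering the isolated spike carries far more energy than the flat frames, which is the kind of localized fluctuation a global DFT averages over the whole sequence.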

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel AI-generated text detection method (SpecDetect/SpecDetect++) that analyzes output token log-probability sequences in the frequency domain — directly addresses output integrity and content provenance by distinguishing LLM-generated from human-written text.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time
Applications
llm-generated text detection, academic integrity, misinformation detection