tool 2025

SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis

0 citations

Published on arXiv

2508.11343

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

SpecDetect outperforms the state-of-the-art training-free LLM text detector while running in nearly half the time, using a single DFT total energy feature

SpecDetect / SpecDetect++

Novel technique introduced

The proliferation of high-quality text from Large Language Models (LLMs) demands reliable and efficient detection methods. While existing training-free approaches show promise, they often rely on surface-level statistics and overlook fundamental signal properties of the text generation process. In this work, we reframe detection as a signal processing problem, introducing a novel paradigm that analyzes the sequence of token log-probabilities in the frequency domain. By systematically analyzing the signal's spectral properties using the global Discrete Fourier Transform (DFT) and the local Short-Time Fourier Transform (STFT), we find that human-written text consistently exhibits significantly higher spectral energy. This higher energy reflects the larger-amplitude fluctuations inherent in human writing compared to the suppressed dynamics of LLM-generated text. Based on this key insight, we construct SpecDetect, a detector built on a single, robust feature from the global DFT: DFT total energy. We also propose an enhanced version, SpecDetect++, which incorporates a sampling discrepancy mechanism to further boost robustness. Extensive experiments show that our approach outperforms the state-of-the-art model while running in nearly half the time. Our work introduces a new, efficient, and interpretable pathway for LLM-generated text detection, showing that classical signal processing techniques offer a surprisingly powerful solution to this modern challenge.
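One way to make the "DFT total energy" feature concrete, offered as a sketch under assumptions rather than the paper's exact formulation: let $x[n]$, $n = 0, \dots, N-1$, denote the token log-probability sequence of a text. The symbols $x$, $X$, $E$, and $N$ are notation chosen here for illustration, and any normalization or preprocessing the authors apply is not stated in this summary.

```latex
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-2\pi i k n / N}, \qquad k = 0, \dots, N-1,
\qquad
E = \sum_{k=0}^{N-1} \lvert X[k] \rvert^{2} = N \sum_{n=0}^{N-1} \lvert x[n] \rvert^{2}.
```

The second equality is Parseval's theorem for the unnormalized DFT. Under this definition, a sequence with larger-amplitude swings (once any constant offset is removed) has higher total energy, which is consistent with the abstract's description of human-written text.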

Key Contributions

Novel frequency-domain paradigm for LLM text detection: treats token log-probability sequences as discrete-time signals and applies DFT/STFT to extract spectral features
SpecDetect: a training-free detector based on a single hyperparameter-free feature (DFT total energy) that captures suppressed spectral dynamics in LLM-generated text
SpecDetect++: enhanced variant incorporating a sampling discrepancy mechanism, improving robustness while running at nearly half the runtime of state-of-the-art methods
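The single-feature idea behind SpecDetect can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dft_total_energy`, the optional mean-centering step, and the synthetic Gaussian "log-probability" sequences are all assumptions made for illustration. In practice the sequence would come from a scoring language model, and a decision threshold would have to be chosen separately.

```python
import numpy as np


def dft_total_energy(logprobs, center=True):
    """Sum of squared DFT magnitudes of a token log-probability sequence.

    `center=True` subtracts the mean first so that the constant (DC) offset
    does not dominate the energy. This preprocessing is an assumption of
    this sketch, not something stated in the summary above.
    """
    x = np.asarray(logprobs, dtype=float)
    if center:
        x = x - x.mean()
    spectrum = np.fft.fft(x)  # global DFT of the whole sequence
    return float(np.sum(np.abs(spectrum) ** 2))


# Synthetic stand-ins for real log-probabilities (hypothetical data):
# one sequence with large fluctuations, one with suppressed fluctuations.
rng = np.random.default_rng(0)
n_tokens = 256
large_fluctuations = rng.normal(loc=-3.0, scale=2.0, size=n_tokens)
small_fluctuations = rng.normal(loc=-3.0, scale=0.5, size=n_tokens)

e_large = dft_total_energy(large_fluctuations)
e_small = dft_total_energy(small_fluctuations)
print(f"large-fluctuation energy: {e_large:.1f}")
print(f"small-fluctuation energy: {e_small:.1f}")
```

Because the feature is a single scalar per text, a detector in this style reduces to comparing that scalar against a threshold; how the paper sets or evaluates that threshold is not described in this summary.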

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel AI-generated text detection method (SpecDetect/SpecDetect++) that analyzes output token log-probability sequences in the frequency domain — directly addresses output integrity and content provenance by distinguishing LLM-generated from human-written text.

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

inference_time

Applications

llm-generated text detection, academic integrity, misinformation detection

Similar Papers

Prompt-Induced Linguistic Fingerprints for LLM-Generated Fake News Detection

Detecting LLM-Generated Text with Performance Guarantees

Multi-Hierarchical Feature Detection for Large Language Model Generated Text

SOGPTSpotter: Detecting ChatGPT-Generated Answers on Stack Overflow

On the Effectiveness of LLM-Specific Fine-Tuning for Detecting AI-Generated Text

EMMM, Explain Me My Model! Explainable Machine Generated Text Detection in Dialogues

NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution

Diversity Boosts AI-Generated Text Detection