AI-Generated Text is Non-Stationary: Detection via Temporal Tomography

The field of AI-generated text detection has evolved from supervised classification to zero-shot statistical analysis. However, current approaches share a fundamental limitation: they aggregate token-level measurements into scalar scores, discarding positional information about where anomalies occur. Our empirical analysis reveals that AI-generated text exhibits significant non-stationarity: its statistical properties vary 73.8% more between text segments than those of human writing. This finding explains why existing detectors fail against localized adversarial perturbations, which exploit this overlooked characteristic. We introduce Temporal Discrepancy Tomography (TDT), a novel detection paradigm that preserves positional information by reformulating detection as a signal processing task. TDT treats token-level discrepancies as a time-series signal and applies the Continuous Wavelet Transform to generate a two-dimensional time-scale representation, capturing both the location and the linguistic scale of statistical anomalies. On the RAID benchmark, TDT achieves 0.855 AUROC (a 7.1% improvement over the best baseline). More importantly, TDT remains robust on adversarial tasks, with a 14.1% AUROC improvement on HART Level 2 paraphrasing attacks. Despite its more elaborate analysis, TDT remains practical, adding only 13% computational overhead. Our work establishes non-stationarity as a fundamental characteristic of AI-generated text and demonstrates that preserving temporal dynamics is essential for robust detection.
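To make the non-stationarity claim concrete, the following is a minimal sketch of one way to compare segment-level statistics of a per-token score sequence. The metric (standard deviation of per-segment means), the function name `segment_variation`, the segment count, and the synthetic data are all assumptions for illustration; the abstract does not state how the 73.8% figure was computed.

```python
import numpy as np

def segment_variation(scores, n_segments=4):
    """Spread of per-segment means: a crude, assumed proxy for non-stationarity.

    A stationary sequence has similar means in every segment, so the
    value stays small; a drifting sequence yields a larger value.
    """
    scores = np.asarray(scores, dtype=float)
    segments = np.array_split(scores, n_segments)  # near-equal contiguous chunks
    seg_means = np.array([seg.mean() for seg in segments])
    return float(seg_means.std())

# Synthetic stand-ins for token-level discrepancy scores (not real data).
rng = np.random.default_rng(0)
stationary = rng.normal(0.0, 1.0, size=400)          # same statistics throughout
drifting = stationary + np.linspace(0.0, 3.0, 400)   # mean shifts along the text

v_stationary = segment_variation(stationary)
v_drifting = segment_variation(drifting)
print(v_stationary < v_drifting)
```

A scalar detector that averages the whole sequence would see only one number per text, whereas a segment-wise view like this exposes where the statistics change.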

Key Contributions

Empirically establishes that AI-generated text is fundamentally non-stationary, with inter-segment statistical variation 73.8% larger than in human text, which explains why scalar-score detectors fail under localized adversarial perturbations.
Introduces Temporal Discrepancy Tomography (TDT), which applies the Continuous Wavelet Transform to token-level discrepancy sequences to produce a 2D time-scale representation capturing both anomaly location and linguistic scale.
Achieves 0.855 AUROC on the RAID benchmark (+7.1% over the best baseline) and +14.1% AUROC on HART Level 2 paraphrasing attacks, with only 13% computational overhead.
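The core idea of the second contribution, turning a one-dimensional discrepancy sequence into a two-dimensional time-scale map, can be sketched as follows. This is a hand-rolled continuous wavelet transform using only NumPy; the Ricker (Mexican hat) wavelet, the scale set, the helper names `ricker` and `cwt_map`, and the synthetic signal are assumptions for illustration, since this summary does not specify the wavelet or the discrepancy measure TDT uses.

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet of width `a`, sampled at `points` positions."""
    t = np.arange(points) - (points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return norm * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))

def cwt_map(signal, scales):
    """Return an array of shape (len(scales), len(signal)).

    Row i holds the signal convolved with the wavelet at scales[i], so the
    column index keeps the token position and the row index the scale.
    Scales are assumed to be >= 1.
    """
    signal = np.asarray(signal, dtype=float)
    out = np.empty((len(scales), signal.size))
    for i, a in enumerate(scales):
        n_points = min(int(10 * a), signal.size)  # truncate the wavelet support
        out[i] = np.convolve(signal, ricker(n_points, a), mode="same")
    return out

# Synthetic discrepancy sequence with one localized anomaly near token 120.
positions = np.arange(256)
signal = np.exp(-((positions - 120) ** 2) / (2.0 * 3.0 ** 2))

scales = [2, 4, 8, 16]
coeffs = cwt_map(signal, scales)

# Position of the strongest response at each scale.
peaks = [int(np.argmax(np.abs(row))) for row in coeffs]
print(coeffs.shape, peaks)
```

Unlike a mean or other scalar aggregate, the resulting map retains where the anomaly sits (the column) and at what span it is most visible (the row), which is the positional information the abstract says scalar scores discard.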

🛡️ Threat Analysis

Output Integrity Attack

The primary contribution is a novel AI-generated text detection paradigm (TDT) that applies the Continuous Wavelet Transform to token-level discrepancies to identify machine-generated content and remains robust against adversarial paraphrasing attacks. This places the work squarely in output integrity / AI content detection.

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

inference_time, black_box

Datasets

RAID, HART

Applications

2026 0 cit.

Similar Papers

Every Language Model Has a Forgery-Resistant Signature

SimKey: A Semantically Aware Key Module for Watermarking Language Models

SENTRA: Selected-Next-Token Transformer for LLM Text Detection

Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection

Black-box Detection of LLM-generated Text Using Generalized Jensen-Shannon Divergence

SearchLLM: Detecting LLM Paraphrased Text by Measuring the Similarity with Regeneration of the Candidate Source via Search Engine

IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation

Variation is the Key: A Variation-Based Framework for LLM-Generated Text Detection