Defense · 2025

A Theoretical Analysis of Detecting Large Model-Generated Time Series

Junji Hou, Junzhou Zhao, Shuo Zhang, Pinghui Wang

2 citations · 38 references · arXiv

Published on arXiv: 2511.07104

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

UCE consistently outperforms state-of-the-art baselines across 32 datasets for detecting model-generated time series, validated by both theoretical proof and empirical experiments

UCE (Uncertainty Contraction Estimator)

Novel technique introduced


Motivated by the increasing risks of data misuse and fabrication, we investigate the problem of identifying synthetic time series generated by Time-Series Large Models (TSLMs). While there is extensive research on detecting model-generated text, we find that these existing methods are not applicable to time series data due to a fundamental modality difference: time series usually have lower information density and smoother probability distributions than text, which limits the discriminative power of token-based detectors. To address this issue, we examine the subtle distributional differences between real and model-generated time series and propose the contraction hypothesis, which states that model-generated time series, unlike real ones, exhibit progressively decreasing uncertainty under recursive forecasting. We formally prove this hypothesis under theoretical assumptions on model behavior and time series structure: model-generated time series exhibit progressively concentrated distributions under recursive forecasting, leading to uncertainty contraction. We also validate the hypothesis empirically across diverse datasets. Building on this insight, we introduce the Uncertainty Contraction Estimator (UCE), a white-box detector that aggregates uncertainty metrics over successive prefixes to identify TSLM-generated time series. Extensive experiments on 32 datasets show that UCE consistently outperforms state-of-the-art baselines, offering a reliable and generalizable solution for detecting model-generated time series.


Key Contributions

  • Introduces the contraction hypothesis: model-generated time series exhibit progressively decreasing uncertainty under recursive forecasting, proven under formal theoretical assumptions
  • Proposes UCE (Uncertainty Contraction Estimator), a white-box detector that aggregates uncertainty over successive prefixes to identify TSLM-generated time series
  • Demonstrates that text-based AI-content detectors fail for time series due to lower information density and smoother distributions, motivating a modality-specific approach
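The detection idea above can be sketched in code. This is a minimal, hypothetical illustration of the contraction hypothesis (not the paper's actual UCE implementation, whose uncertainty metrics and aggregation are not given here): feed a forecaster successively longer prefixes of a series, record a scalar uncertainty estimate for each, and score how strongly uncertainty shrinks. The `forecaster` callable and all parameter names are assumptions for illustration.

```python
import numpy as np

def uncertainty_contraction_score(series, forecaster, n_prefixes=8, min_len=16):
    """Hypothetical UCE-style score (illustration only, not the paper's code).

    `forecaster(prefix)` is assumed to return a scalar predictive
    uncertainty (e.g. a standard deviation) for the next step given
    the prefix, as a white-box model would expose.
    """
    # Evaluate uncertainty on successively longer prefixes of the series.
    lengths = np.linspace(min_len, len(series), n_prefixes, dtype=int)
    uncertainties = [forecaster(series[:n]) for n in lengths]
    # Fit a line to uncertainty vs. prefix index; a negative slope
    # means uncertainty is contracting as the prefix grows.
    slope = np.polyfit(np.arange(len(uncertainties)), uncertainties, 1)[0]
    # Higher score => stronger contraction => more likely model-generated
    # under the contraction hypothesis.
    return -slope
```

A threshold on this score would then separate model-generated series (uncertainty contracting, positive score) from real ones; the paper's actual detector aggregates richer uncertainty metrics than a single linear slope.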

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated content detection — identifying synthetic time series produced by Time-Series Large Models. UCE is a novel detection architecture grounded in new theoretical foundations (the contraction hypothesis), not merely an application of existing methods to a new domain. This is output integrity/provenance for a new modality.


Details

Domains
timeseries
Model Types
transformer
Threat Tags
white_box · inference_time
Datasets
32 diverse time series datasets (specific names not given in abstract/LaTeX header)
Applications
time series generation detection · synthetic data detection · data fabrication detection