Defense · 2025

A Theoretical Analysis of Detecting Large Model-Generated Time Series

Junji Hou, Junzhou Zhao, Shuo Zhang, Pinghui Wang

2 citations · 38 references · arXiv

Published on arXiv: 2511.07104

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

UCE consistently outperforms state-of-the-art baselines across 32 datasets for detecting model-generated time series, validated by both theoretical proof and empirical experiments

UCE (Uncertainty Contraction Estimator)

Novel technique introduced


Motivated by the increasing risks of data misuse and fabrication, we investigate the problem of identifying synthetic time series generated by Time-Series Large Models (TSLMs). While there is extensive research on detecting model-generated text, we find that these existing methods are not applicable to time series data due to a fundamental modality difference: time series usually have lower information density and smoother probability distributions than text, which limits the discriminative power of token-based detectors. To address this issue, we examine the subtle distributional differences between real and model-generated time series and propose the contraction hypothesis, which states that model-generated time series, unlike real ones, exhibit progressively decreasing uncertainty under recursive forecasting. We formally prove this hypothesis under theoretical assumptions on model behavior and time series structure: model-generated time series exhibit progressively concentrated distributions under recursive forecasting, leading to uncertainty contraction. We also validate the hypothesis empirically across diverse datasets. Building on this insight, we introduce the Uncertainty Contraction Estimator (UCE), a white-box detector that aggregates uncertainty metrics over successive prefixes to identify TSLM-generated time series. Extensive experiments on 32 datasets show that UCE consistently outperforms state-of-the-art baselines, offering a reliable and generalizable solution for detecting model-generated time series.


Key Contributions

  • Introduces the contraction hypothesis: model-generated time series exhibit progressively decreasing uncertainty under recursive forecasting, proven under formal theoretical assumptions
  • Proposes UCE (Uncertainty Contraction Estimator), a white-box detector that aggregates uncertainty over successive prefixes to identify TSLM-generated time series
  • Demonstrates that text-based AI-content detectors fail for time series due to lower information density and smoother distributions, motivating a modality-specific approach
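The detection idea above can be sketched in code. This is a minimal, hypothetical illustration of the contraction hypothesis (not the paper's actual UCE implementation, whose uncertainty metrics and aggregation are not given here): feed a forecaster successively longer prefixes of a series, record a scalar uncertainty estimate for each, and score how strongly uncertainty shrinks. The `forecaster` callable and all parameter names are assumptions for illustration.

```python
import numpy as np

def uncertainty_contraction_score(series, forecaster, n_prefixes=8, min_len=16):
    """Hypothetical UCE-style score (illustration only, not the paper's code).

    `forecaster(prefix)` is assumed to return a scalar predictive
    uncertainty (e.g. a standard deviation) for the next step given
    the prefix, as a white-box model would expose.
    """
    # Evaluate uncertainty on successively longer prefixes of the series.
    lengths = np.linspace(min_len, len(series), n_prefixes, dtype=int)
    uncertainties = [forecaster(series[:n]) for n in lengths]
    # Fit a line to uncertainty vs. prefix index; a negative slope
    # means uncertainty is contracting as the prefix grows.
    slope = np.polyfit(np.arange(len(uncertainties)), uncertainties, 1)[0]
    # Higher score => stronger contraction => more likely model-generated
    # under the contraction hypothesis.
    return -slope
```

A threshold on this score would then separate model-generated series (uncertainty contracting, positive score) from real ones; the paper's actual detector aggregates richer uncertainty metrics than a single linear slope.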

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated content detection — identifying synthetic time series produced by Time-Series Large Models. UCE is a novel detection architecture grounded in new theoretical foundations (the contraction hypothesis), not merely an application of existing methods to a new domain. This is output integrity/provenance for a new modality.


Details

Domains
timeseries
Model Types
transformer
Threat Tags
white_box · inference_time
Datasets
32 diverse time series datasets (specific names not given in abstract/LaTeX header)
Applications
time series generation detection · synthetic data detection · data fabrication detection