Published on arXiv

2603.20108

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

Released a public benchmark of 45 backdoored forecasting models, together with an evaluation framework for trigger identification research


Forecasting plays a crucial role in modern safety-critical applications, such as space operations. However, the increasing use of deep forecasting models introduces a new security risk of trojan horse attacks, carried out by hiding a backdoor in the training data or directly in the model weights. Once implanted, the backdoor is activated by a specific trigger pattern at test time, causing the model to produce manipulated predictions. We focus on this issue in our Trojan Horse Hunt data science competition, where more than 200 teams faced the task of identifying triggers hidden in deep forecasting models for spacecraft telemetry. We describe the novel task formulation, benchmark set, evaluation protocol, and best solutions from the competition. We further summarize key insights and research directions for effective identification of triggers in time series forecasting models. All materials are publicly available on the official competition webpage: https://www.kaggle.com/competitions/trojan-horse-hunt-in-space.


Key Contributions

  • Novel competition task formulation for trigger reconstruction in time series forecasting models
  • Benchmark dataset of 46 N-HiTS models (1 clean, 45 trojaned) for spacecraft telemetry
  • Evaluation protocol and baseline algorithm inspired by Neural Cleanse for trigger detection
  • Summary of top solutions from 200+ competing teams and research directions
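
To make the trigger reconstruction task more concrete, the following is a minimal toy sketch, not the competition baseline and not the paper's method. It assumes a hypothetical 8-sample input window, a hand-written `clean_model` and `poisoned_model` standing in for the clean and trojaned N-HiTS forecasters, and a hypothetical `HIDDEN_TRIGGER`. Instead of the mask-and-norm objective used by Neural Cleanse, it swaps in a simpler idea in the same spirit: search for a bounded additive pattern that maximizes the gap between the suspect model and the clean reference, using projected gradient ascent with finite-difference gradients.

```python
import math
import random

WINDOW = 8  # hypothetical input window length
# Hypothetical hidden trigger; in this toy setup only the "attacker" knows it.
HIDDEN_TRIGGER = [0.0, 0.0, 1.0, -1.0, 1.0, 0.0, 0.0, 0.0]


def clean_model(x):
    """Toy clean forecaster: mean of the last three samples."""
    return sum(x[-3:]) / 3.0


def poisoned_model(x):
    """Toy backdoored forecaster: the clean forecast plus a smooth shift that
    grows once the input correlates strongly with the hidden trigger."""
    match = sum(xi * ti for xi, ti in zip(x, HIDDEN_TRIGGER))
    z = 4.0 * (match - 1.5)
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))  # stable softplus
    return clean_model(x) + 5.0 * softplus / 4.0


def deviation(delta, windows):
    """Mean absolute gap between suspect and reference forecasts when the
    candidate trigger delta is added to every clean window."""
    total = 0.0
    for x in windows:
        xt = [xi + di for xi, di in zip(x, delta)]
        total += abs(poisoned_model(xt) - clean_model(xt))
    return total / len(windows)


def reconstruct_trigger(windows, steps=60, lr=0.1, bound=1.0, h=1e-3):
    """Projected gradient ascent on deviation() with central differences.
    The amplitude bound is an assumption of this sketch."""
    delta = [0.0] * WINDOW
    for _ in range(steps):
        grad = []
        for i in range(WINDOW):
            up = delta[:]
            up[i] += h
            down = delta[:]
            down[i] -= h
            grad.append((deviation(up, windows) - deviation(down, windows)) / (2 * h))
        scale = max(abs(g) for g in grad)
        if scale == 0.0:
            break  # flat objective: nothing to follow
        # Normalized step, then projection back into [-bound, bound].
        delta = [min(bound, max(-bound, d + lr * g / scale))
                 for d, g in zip(delta, grad)]
    return delta


rng = random.Random(0)
clean_windows = [[rng.uniform(-0.2, 0.2) for _ in range(WINDOW)] for _ in range(16)]
recovered = reconstruct_trigger(clean_windows)
print([round(d, 2) for d in recovered])
```

In this toy case the recovered pattern lines up with the hidden trigger because the search bound happens to equal the trigger amplitude. In general, a bounded search of this kind recovers the shape and sign of a trigger more reliably than its magnitude, and real forecasting models would call for automatic differentiation rather than finite differences.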

🛡️ Threat Analysis

Model Poisoning

The paper focuses on trojan horse (backdoor) attacks on forecasting models, in which a trigger activates malicious behavior. The competition task is to reverse-engineer hidden triggers from poisoned models, which is a form of backdoor detection, the core defense problem for ML10.


Details

Domains
timeseries
Model Types
traditional_ml
Threat Tags
training_time, targeted
Datasets
spacecraft telemetry data, 46 N-HiTS models benchmark
Applications
spacecraft telemetry forecasting, space operations