defense 2025

TrajSyn: Privacy-Preserving Dataset Distillation from Federated Model Trajectories for Server-Side Adversarial Training

Mukur Gupta 1, Niharika Gupta 2, Saifur Rahman 3, Shantanu Pal 3, Chandan Karmakar 3

0 citations · 35 references · arXiv

Published on arXiv

2512.15123

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

TrajSyn enables effective server-side adversarial training in FL by distilling a synthetic dataset from client model weight trajectories, consistently improving adversarial robustness with no additional client-side compute cost.

TrajSyn

Novel technique introduced


Deep learning models deployed on edge devices are increasingly used in safety-critical applications. However, their vulnerability to adversarial perturbations poses significant risks, especially in Federated Learning (FL) settings where identical models are distributed across thousands of clients. While adversarial training is a strong defense, it is difficult to apply in FL due to strict client-data privacy constraints and the limited compute available on edge devices. In this work, we introduce TrajSyn, a privacy-preserving framework that enables effective server-side adversarial training by synthesizing a proxy dataset from the trajectories of client model updates, without accessing raw client data. We show that TrajSyn consistently improves adversarial robustness on image classification benchmarks with no extra compute burden on the client device.


Key Contributions

  • TrajSyn framework that synthesizes a privacy-preserving proxy dataset from client model update trajectories (without accessing raw client data), enabling server-side adversarial training in federated learning
  • Demonstration that server-side adversarial training on synthesized trajectory-derived data consistently improves adversarial robustness on image classification benchmarks
  • No additional computational overhead on client devices: the adversarial training burden is fully offloaded to the server
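
The first contribution can be pictured with a toy trajectory-matching loop. This summary does not state TrajSyn's actual distillation objective, so everything below is an assumption made for illustration: a linear regression model, a normalized weight-matching loss of the kind often used in trajectory-based dataset distillation, finite-difference gradients in place of backpropagation, and hypothetical helper names (`record_trajectory`, `matching_loss`, `distill`). The only point it tries to show is that the optimizer sees model checkpoints, never the raw data.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgd_step(w, xs, ys, lr):
    """One gradient step on mean squared error for the linear model y = w . x."""
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = dot(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(xs)
    return [wj - lr * gj for wj, gj in zip(w, grad)]

def record_trajectory(xs, ys, w0, lr, rounds):
    """Hypothetical stand-in for the server logging the global model each round."""
    traj = [list(w0)]
    for _ in range(rounds):
        traj.append(sgd_step(traj[-1], xs, ys, lr))
    return traj

def matching_loss(traj, syn_x, syn_y, lr):
    """Normalized squared distance between a step on synthetic data and the next checkpoint."""
    total = 0.0
    for w_start, w_next in zip(traj, traj[1:]):
        w_hat = sgd_step(w_start, syn_x, syn_y, lr)
        num = sum((a - b) ** 2 for a, b in zip(w_hat, w_next))
        den = sum((a - b) ** 2 for a, b in zip(w_start, w_next)) + 1e-12
        total += num / den
    return total

def distill(traj, n_syn=4, lr=0.1, steps=100, seed=0, h=1e-5):
    """Fit synthetic (x, y) pairs to a weight trajectory; returns (syn_x, syn_y, loss_history)."""
    rng = random.Random(seed)
    d = len(traj[0])
    # Flat parameter vector: n_syn inputs of dimension d, followed by n_syn labels.
    theta = [rng.uniform(-1.0, 1.0) for _ in range(n_syn * (d + 1))]

    def unpack(t):
        return [t[i * d:(i + 1) * d] for i in range(n_syn)], t[n_syn * d:]

    def loss(t):
        syn_x, syn_y = unpack(t)
        return matching_loss(traj, syn_x, syn_y, lr)

    history = [loss(theta)]
    for _ in range(steps):
        # Central finite differences keep the sketch dependency-free (assumption);
        # a real pipeline would differentiate through the inner training step.
        grad = []
        for k in range(len(theta)):
            plus = theta[:k] + [theta[k] + h] + theta[k + 1:]
            minus = theta[:k] + [theta[k] - h] + theta[k + 1:]
            grad.append((loss(plus) - loss(minus)) / (2 * h))
        # Backtracking line search: halve the step until the loss improves.
        step, current = 1.0, history[-1]
        while step > 1e-12:
            trial = [t - step * g for t, g in zip(theta, grad)]
            if loss(trial) < current:
                theta = trial
                break
            step /= 2
        history.append(loss(theta))
    syn_x, syn_y = unpack(theta)
    return syn_x, syn_y, history

# Private "client" data: used only to produce the trajectory, never passed to distill().
rng = random.Random(1)
real_x = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
real_y = [2.0 * x[0] - 1.0 * x[1] for x in real_x]
traj = record_trajectory(real_x, real_y, w0=[0.0, 0.0], lr=0.1, rounds=5)

syn_x, syn_y, history = distill(traj)
print(f"matching loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

In this sketch the four synthetic pairs are the only output of the distillation step; a server could then train on them as a proxy for the data it never saw. How well such a proxy preserves privacy is a separate question that the toy example does not address.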

🛡️ Threat Analysis

Input Manipulation Attack

The paper's explicit goal is to defend FL-deployed models against adversarial input perturbations. TrajSyn enables adversarial training (the primary defense against ML01 attacks) by synthesizing a proxy dataset from model update trajectories, which lets the server perform adversarially augmented fine-tuning without access to client data. The entire framework is motivated by the threat of gradient-based adversarial examples crafted against the identical model that is shared across edge clients.
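
As a rough illustration of what adversarially augmented fine-tuning on a proxy dataset can look like, the sketch below runs FGSM-style adversarial training on a logistic regression model. The choice of FGSM, the linear model, the perturbation budget `eps`, and the names `fgsm`, `adversarial_finetune`, `robust_accuracy`, and `proxy_data` are all assumptions for the example; the summary does not say which attack or architecture the authors use.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """White-box FGSM: x + eps * sign(dLoss/dx), where dLoss/dx = (p - y) * w for cross-entropy."""
    p = predict(w, b, x)
    return [xj + eps * sign((p - y) * wj) for xj, wj in zip(x, w)]

def adversarial_finetune(w, b, data, eps, lr=0.5, epochs=200):
    """Full-batch gradient descent where every example is replaced by its FGSM version."""
    w = list(w)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in data:
            x_adv = fgsm(w, b, x, y, eps)
            err = predict(w, b, x_adv) - y
            for j, xj in enumerate(x_adv):
                grad_w[j] += err * xj / len(data)
            grad_b += err / len(data)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def robust_accuracy(w, b, data, eps):
    """Accuracy when each input is attacked against the evaluated model itself."""
    hits = 0
    for x, y in data:
        x_adv = fgsm(w, b, x, y, eps)
        hits += int((predict(w, b, x_adv) >= 0.5) == (y == 1))
    return hits / len(data)

# Hypothetical proxy set standing in for the server's synthesized data.
rng = random.Random(0)
proxy_data = []
for _ in range(40):
    y = rng.randint(0, 1)
    center = 1.5 if y == 1 else -1.5
    proxy_data.append(([center + rng.uniform(-0.5, 0.5), rng.uniform(-1, 1)], y))

w, b = adversarial_finetune([0.0, 0.0], 0.0, proxy_data, eps=0.3)
print("robust accuracy on proxy set:", robust_accuracy(w, b, proxy_data, eps=0.3))
```

All of this runs on the server side of the sketch: the loop touches only `proxy_data` and the model parameters, which mirrors the claim that clients carry no extra compute.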


Details

Domains
vision, federated-learning
Model Types
cnn, federated
Threat Tags
white_box, inference_time
Datasets
image classification benchmarks (dataset names are not given in the available text)
Applications
image classification, federated learning on edge devices