
Gradient Inversion in Federated Reinforcement Learning

Shenghong He

0 citations · 27 references · arXiv

Published on arXiv · 2512.00303

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

RGIA successfully reconstructs private state-action-reward-next-state tuples in FRL by enforcing transition-dynamics regularization, overcoming the pseudo-solution problem that defeats standard gradient inversion methods.

RGIA (Regularization Gradient Inversion Attack)

Novel technique introduced


Abstract

Federated reinforcement learning (FRL) enables distributed learning of optimal policies while preserving local data privacy through gradient sharing. However, FRL faces the risk of data privacy leakage, where attackers exploit shared gradients to reconstruct local training data. Compared to traditional supervised federated learning, successful reconstruction in FRL requires the generated data not only to match the shared gradients but also to align with the real transition dynamics of the environment (i.e., with the real data transition distribution). To address this issue, we propose a novel attack method called Regularization Gradient Inversion Attack (RGIA), which enforces prior-knowledge-based regularization on states, rewards, and transition dynamics during optimization to keep the reconstructed data close to the true transition distribution. Theoretically, we prove that the prior-knowledge-based regularization term narrows the solution space from a broad set containing spurious solutions to a constrained subset that satisfies both gradient matching and the true transition dynamics. Extensive experiments on control tasks and autonomous driving tasks demonstrate that RGIA effectively constrains the reconstructed data's transition distribution and thus successfully reconstructs local private data.
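The regularized objective described above can be sketched in miniature. The snippet below is an illustrative toy, not the paper's implementation: it assumes a linear value model `V(s) = w @ s` trained with a TD-style squared error, and a hypothetical linear dynamics prior `s_next ≈ A @ s` standing in for RGIA's prior-knowledge-based transition regularization. The attacker optimizes a dummy (state, reward, next-state) tuple so that its gradient matches the shared one while the dynamics residual stays small.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, dim, lam = 0.9, 3, 1.0

# Toy client: linear value model V(s) = w @ s with a TD-style squared loss.
w = rng.normal(size=dim)
# Hypothetical dynamics prior assumed known to the attacker: s_next ≈ A @ s.
A = 0.8 * np.eye(dim)

# Private tuple held by the client.
s_true = rng.normal(size=dim)
s_next_true = A @ s_true
r_true = 1.0

def client_grad(s, r, s_next):
    # d/dw of 0.5 * (w@s - (r + gamma * w@s_next))**2, target treated as constant.
    delta = w @ s - (r + gamma * (w @ s_next))
    return delta * s

g_shared = client_grad(s_true, r_true, s_next_true)  # what the attacker observes

def attack_loss(x):
    # Gradient matching plus dynamics regularization: the RGIA idea in miniature.
    s, r, s_next = x[:dim], x[dim], x[dim + 1:]
    mismatch = np.sum((client_grad(s, r, s_next) - g_shared) ** 2)
    dynamics = np.sum((s_next - A @ s) ** 2)
    return mismatch + lam * dynamics

def num_grad(f, x, eps=1e-5):
    # Central finite differences (keeps the sketch free of autodiff dependencies).
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = rng.normal(size=2 * dim + 1)   # dummy (s, r, s_next), randomly initialized
loss0 = attack_loss(x)
for _ in range(20000):             # normalized gradient descent for step-size stability
    g = num_grad(attack_loss, x)
    x -= 0.01 * g / (1.0 + np.linalg.norm(g))

s_rec, s_next_rec = x[:dim], x[dim + 1:]
print(f"attack loss: {loss0:.3f} -> {attack_loss(x):.5f}")
```

Without the `dynamics` term (`lam = 0`), the optimizer is free to settle on any gradient-matching tuple, physically valid or not; the regularizer is what biases the search toward dynamics-consistent reconstructions.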


Key Contributions

  • Identifies the pseudo-solution problem unique to FRL gradient inversion: reconstructed data must match gradients AND conform to true environment transition dynamics
  • Proposes RGIA, which enforces prior-knowledge-based regularization over states, rewards, and transition dynamics to constrain the solution space to physically valid trajectories
  • Theoretical proof that the regularization term narrows the solution space from a broad spurious set to a constrained subset satisfying both gradient matching and true transition dynamics
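The pseudo-solution problem in the first bullet can be made concrete with a toy example. Using the same illustrative linear TD-style model as above (an assumption for exposition, not the paper's architecture), infinitely many (reward, next-state) pairs produce the identical shared gradient, so gradient matching alone cannot identify the true tuple:

```python
import numpy as np

rng = np.random.default_rng(1)
gamma, dim = 0.9, 3
w, s = rng.normal(size=dim), rng.normal(size=dim)  # toy linear model and a fixed state

def grad(r, s_next):
    # Gradient of the TD-style squared loss w.r.t. w, as in the sketch above.
    delta = w @ s - (r + gamma * (w @ s_next))
    return delta * s

r1, sn1 = 1.0, rng.normal(size=dim)          # the "true" reward and next state
sn2 = sn1 + rng.normal(size=dim)             # a different, dynamics-violating next state
r2 = r1 + gamma * (w @ (sn1 - sn2))          # reward adjusted so the TD error is unchanged

print(np.allclose(grad(r1, sn1), grad(r2, sn2)))  # -> True: same gradient, different tuple
```

Both tuples match the shared gradient exactly, which is precisely the spurious-solution set that RGIA's transition-dynamics regularization is designed to rule out.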

🛡️ Threat Analysis

Model Inversion Attack

The paper's primary contribution is an adversarial attack (RGIA) in which an attacker reconstructs private local training data, i.e. (state, action, reward, next-state) tuples, from the gradients shared in federated reinforcement learning: a direct gradient-leakage / data-reconstruction threat model.


Details

Domains
reinforcement-learning, federated-learning
Model Types
rl, federated
Threat Tags
white_box, training_time
Applications
federated reinforcement learning, autonomous driving, control tasks