Enhanced Privacy Leakage from Noise-Perturbed Gradients via Gradient-Guided Conditional Diffusion Models
Jiayang Meng 1, Tao Huang 2,3, Hong Chen 1, Chen Hou 2,3, Guolong Zheng 2,3
Published on arXiv
2511.10423
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
GG-CDM leverages the denoising capability of diffusion models to circumvent noise-perturbation defenses in federated learning, reconstructing private images with higher fidelity than existing gradient inversion attacks when gradients are noise-perturbed.
GG-CDM (Gradient-guided Conditional Diffusion Models)
Novel technique introduced
Federated learning synchronizes models through gradient transmission and aggregation. However, these gradients pose significant privacy risks, as sensitive training data is embedded within them. Existing gradient inversion attacks suffer from significantly degraded reconstruction performance when gradients are perturbed by noise, a common defense mechanism. In this paper, we introduce gradient-guided conditional diffusion models for reconstructing private images from leaked gradients, without prior knowledge of the target data distribution. Our approach leverages the inherent denoising capability of diffusion models to circumvent the partial protection offered by noise perturbation, thereby improving attack performance under such defenses. We further provide a theoretical analysis of the reconstruction error bounds and the convergence properties of the attack loss, characterizing the impact of key factors, such as noise magnitude and attacked model architecture, on reconstruction quality. Extensive experiments demonstrate our attack's superior reconstruction performance with Gaussian noise-perturbed gradients, and confirm our theoretical findings.
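Why a noise-perturbed gradient can still leak the input, and why a prior over the data is needed, can be illustrated with a deliberately tiny example. The model, numbers, and the "pick the smaller-norm candidate" heuristic below are all hypothetical, not the paper's setup; only the gradient-inversion idea itself is from the abstract.

```python
import numpy as np

# Toy illustration (hypothetical, not the paper's GG-CDM): gradient
# inversion against a one-layer linear model with squared loss, where
# the client shares a Gaussian-noise-perturbed gradient.
w = np.array([1.0, -0.5, 0.25, 0.75])      # model weights, known to attacker
x_true = np.array([0.6, -0.2, 1.1, -0.4])  # private client input
y = 2.0                                    # label, assumed known

# Client-side gradient of 0.5*(w.x - y)^2 w.r.t. w is (w.x - y) * x;
# the fixed perturbation below stands in for one Gaussian noise draw.
g = (w @ x_true - y) * x_true + np.array([0.03, -0.02, 0.05, -0.04])

# For this model the inversion is closed-form: x = g / r is consistent
# only if r solves r**2 + y*r - (w @ g) = 0, which has two real roots
# here; both reproduce the observed gradient exactly, so the gradient
# alone cannot tell them apart.
c = w @ g
disc = np.sqrt(y * y + 4 * c)
candidates = [g / r for r in ((-y + disc) / 2, (-y - disc) / 2)]

# A prior over plausible inputs breaks the tie; as a crude stand-in for
# a learned prior we prefer the smaller-norm (less extreme) candidate.
x_hat = min(candidates, key=np.linalg.norm)
print(np.round(x_hat, 3))
```

Both candidates fit the noisy gradient perfectly, so reconstruction quality hinges entirely on the prior used to choose between them; the paper's approach is to supply that prior through a diffusion model's denoising process rather than a hand-coded heuristic.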
Key Contributions
- Gradient-guided Conditional Diffusion Model (GG-CDM) attack that reconstructs high-fidelity private images from noise-perturbed gradients without prior knowledge of the target data distribution
- Theoretical analysis deriving reconstruction error bounds and convergence properties of the attack loss, and introduction of the RV metric to quantify a model's intrinsic vulnerability to gradient inversion attacks
- Extensive experiments demonstrating superior reconstruction performance over existing GIAs under Gaussian noise-perturbation defenses
🛡️ Threat Analysis
The paper's primary contribution is a gradient inversion attack: an adversary reconstructs private training images from shared federated learning gradients. Using diffusion models' denoising capability to overcome noise perturbation defenses is a novel technique for training data reconstruction — the canonical ML03 threat.
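The guidance structure behind such an attack can be sketched as an iterative update that adds two terms: a prior step (in GG-CDM, a diffusion denoising step) and a gradient-matching step toward the leaked gradient. The deterministic toy below is a loose analogue only: it uses a one-layer linear model with squared loss and a simple Gaussian prior as a stand-in for the learned denoiser, with made-up numbers; it is not the paper's algorithm.

```python
import numpy as np

# Hypothetical toy analogue of prior-guided gradient inversion: each
# iteration combines a prior term (a Gaussian prior pulling toward 0,
# standing in for a learned denoiser) with a gradient-matching term
# toward the leaked, noise-perturbed gradient g. The client-side
# gradient of 0.5*(w.x - y)^2 w.r.t. w is (w.x - y) * x.
w = np.array([1.0, -0.5, 0.25, 0.75])
x_true = np.array([0.6, -0.2, 1.1, -0.4])
y = 2.0
g = (w @ x_true - y) * x_true + np.array([0.03, -0.02, 0.05, -0.04])

lam, lr = 0.01, 0.02
x = np.zeros(4)                     # start from an uninformative sample
for _ in range(5000):
    r = w @ x - y
    f = r * x - g                   # gradient-matching residual
    grad_match = 2 * ((x @ f) * w + r * f)  # d/dx ||r*x - g||^2
    grad_prior = 2 * lam * x                # d/dx lam*||x||^2
    x -= lr * (grad_match + grad_prior)

print(np.round(x, 3))
```

Even with noise in the observed gradient, the guided iteration settles on a reconstruction close to the private input; the prior term steers the search toward plausible inputs among the gradient-consistent solutions, which is the role the diffusion model plays in the paper's attack.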