Tackling Federated Unlearning as a Parameter Estimation Problem
Antonio Balordi, Lorenzo Manini, Fabio Stella, Alessio Merlo
Published on arXiv (arXiv:2508.19065)
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
Achieves MIA success near random chance and ~0.9 normalized accuracy relative to full retraining, while also neutralizing targeted backdoor attacks as a secondary benefit.
Privacy regulations require that data be erasable from deep learning models. This challenge is amplified in Federated Learning (FL), where data remains on clients, making full retraining or coordinated updates often infeasible. This work introduces an efficient Federated Unlearning framework grounded in information theory, modeling leakage as a parameter estimation problem. The method uses second-order Hessian information to identify and selectively reset only the parameters most sensitive to the data being forgotten, followed by minimal federated retraining. This model-agnostic approach supports both categorical and client unlearning without requiring server access to raw client data after the initial information aggregation. Evaluations on benchmark datasets demonstrate strong privacy (MIA success near random chance, categorical knowledge erased) and high performance (normalized accuracy of ≈0.9 against retrained benchmarks), while aiming for greater efficiency than complete retraining. In a targeted backdoor attack scenario, the framework also neutralizes the malicious trigger, restoring model integrity. Together, these results offer a practical solution for data forgetting in FL.
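The core mechanism described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: it assumes a diagonal-Fisher approximation (squared gradients averaged over the forget set) as the second-order sensitivity proxy, and the function names, the top-fraction cutoff, and the reset-to-initialization choice are all hypothetical.

```python
import numpy as np

def sensitivity_scores(grads_on_forget_data):
    """Diagonal-Fisher proxy for second-order sensitivity:
    mean squared per-parameter gradient over the forget set."""
    return np.mean(np.square(grads_on_forget_data), axis=0)

def selective_reset(params, init_params, scores, frac=0.05):
    """Reset only the top `frac` most sensitive parameters to their
    initialization values; leave all other parameters untouched."""
    k = max(1, int(frac * params.size))
    idx = np.argsort(scores)[-k:]  # indices of the most sensitive params
    reset = params.copy()
    reset[idx] = init_params[idx]
    return reset, idx
```

After this reset, a few rounds of federated retraining on the retained data would recover utility, per the framework's design.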
Key Contributions
- Information-theoretic federated unlearning framework that models data leakage as a parameter estimation problem using second-order Hessian information
- Selective parameter resetting targeting only the parameters most sensitive to forgotten data, followed by minimal federated retraining — avoiding full model retrain
- Demonstrated effectiveness for both categorical and client unlearning, with MIA success near random and ~0.9 normalized accuracy vs. retrained baseline; also shown to neutralize targeted backdoor triggers
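The "minimal federated retraining" step in the contributions above can be sketched as a standard FedAvg aggregation round; the function name and the size-weighted averaging scheme are assumptions for illustration, since the server only ever sees client model updates, never raw data.

```python
import numpy as np

def fedavg_round(client_params, client_sizes):
    """One FedAvg aggregation round: size-weighted mean of client
    parameter vectors. After the selective reset, a small number of
    such rounds on retained client data restores model utility
    without the server accessing raw client data."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    stacked = np.stack(client_params)  # shape: (n_clients, n_params)
    return np.tensordot(weights, stacked, axes=1)
```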
🛡️ Threat Analysis
The paper's security evaluation centers on membership inference attack (MIA) success rate as its primary privacy metric: the unlearning method reduces MIA success to near-random chance, directly addressing the MIA threat model for unlearned data points.
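The metric above can be illustrated with a simple loss-threshold MIA, a common baseline attack (the paper's exact attack may differ; the function and threshold here are illustrative assumptions). The attacker flags a sample as a training member when the model's loss on it is low; balanced accuracy near 0.5 means the attack does no better than guessing.

```python
import numpy as np

def mia_success_rate(member_losses, nonmember_losses, threshold):
    """Loss-threshold membership inference: predict 'member' when the
    model's loss on a sample falls below `threshold`. Returns balanced
    accuracy; ~0.5 means the unlearned data leaves no detectable trace."""
    tpr = np.mean(np.asarray(member_losses) < threshold)       # members flagged
    tnr = np.mean(np.asarray(nonmember_losses) >= threshold)   # non-members passed
    return 0.5 * (tpr + tnr)
```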