defense 2025

Lightweight and Robust Federated Data Valuation

Guojun Tang¹, Jiayu Zhou², Mohammad Mamun³, Steve Drew¹

0 citations · 41 references · ICDMW


Published on arXiv: 2509.25560

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

FedIF matches or exceeds the robustness of Shapley-value-based methods against adversarial FL clients while reducing aggregation overhead by up to 450x on CIFAR-10 and Fashion-MNIST.

FedIF

Novel technique introduced


Federated learning (FL) faces persistent robustness challenges due to non-IID data distributions and adversarial client behavior. A promising mitigation strategy is contribution evaluation, which enables adaptive aggregation by quantifying each client's utility to the global model. However, state-of-the-art Shapley-value-based approaches incur high computational overhead due to repeated model reweighting and inference, which limits their scalability. We propose FedIF, a novel FL aggregation framework that leverages trajectory-based influence estimation to efficiently compute client contributions. FedIF adapts decentralized FL by introducing normalized and smoothed influence scores computed from lightweight gradient operations on client updates and a public validation set. Theoretical analysis demonstrates that FedIF yields a tighter bound on one-step global loss change under noisy conditions. Extensive experiments on CIFAR-10 and Fashion-MNIST show that FedIF achieves robustness comparable to or exceeding SV-based methods in the presence of label noise, gradient noise, and adversarial samples, while reducing aggregation overhead by up to 450x. Ablation studies confirm the effectiveness of FedIF's design choices, including local weight normalization and influence smoothing. Our results establish FedIF as a practical, theoretically grounded, and scalable alternative to Shapley-value-based approaches for efficient and robust FL in real-world deployments.


Key Contributions

  • FedIF: a trajectory-based influence estimation framework for FL aggregation that replaces expensive Shapley-value reweighting with lightweight gradient-based client contribution scores
  • Normalized and smoothed influence scores that provide a tighter theoretical bound on one-step global loss change under noisy/adversarial conditions
  • Up to 450x reduction in aggregation overhead vs. SV-based methods while matching or exceeding robustness against label noise, gradient noise, and adversarial samples
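The influence-scoring idea behind these contributions can be sketched as follows. This is an illustrative reading of the summary, not the paper's exact algorithm: the function name `influence_scores`, the EMA coefficient `beta`, and the clipping of negative scores are assumptions. The core first-order idea is that a client's influence on the validation loss is approximated by the alignment of its (locally normalized) update with the gradient on a public validation set:

```python
import numpy as np

def influence_scores(client_updates, val_grad, prev_scores=None, beta=0.9, eps=1e-12):
    """Sketch of trajectory-based influence scoring (hypothetical implementation).

    Raw influence = inner product between each locally normalized client
    update and the global-model gradient on a public validation set; the
    resulting weights are normalized and smoothed with an EMA across rounds.
    """
    raw = []
    for delta in client_updates:
        delta = delta / (np.linalg.norm(delta) + eps)      # local weight normalization
        raw.append(float(np.dot(val_grad, delta)))         # first-order influence on val loss
    raw = np.clip(np.array(raw), 0.0, None)                # drop clients that hurt val loss
    scores = raw / (raw.sum() + eps)                       # normalize to aggregation weights
    if prev_scores is not None:
        scores = beta * prev_scores + (1 - beta) * scores  # influence smoothing across rounds
        scores = scores / (scores.sum() + eps)
    return scores
```

Only one gradient evaluation on the validation set and one dot product per client are needed per round, which is where the large speedup over repeated Shapley-style model reweighting and inference would come from.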

🛡️ Threat Analysis

Data Poisoning Attack

FedIF defends against Byzantine/adversarial clients in federated learning — including label flipping (data poisoning) and gradient attacks — by computing per-client influence scores to adaptively downweight malicious participants during aggregation. This is a robust aggregation defense against training-time data and gradient poisoning in FL.
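A minimal sketch of the defensive aggregation step described above, assuming contribution scores have already been computed (the helper name `robust_aggregate` and the toy numbers are illustrative, not from the paper):

```python
import numpy as np

def robust_aggregate(global_w, client_updates, scores):
    # Influence-weighted aggregation step: each client's update is scaled
    # by its contribution score, so a poisoning client whose update
    # misaligns with the public validation gradient gets near-zero weight.
    agg = sum(s * d for s, d in zip(scores, client_updates))
    return global_w + agg

# Toy example: client 1 is a label-flipping attacker whose update points
# opposite to the honest direction; its influence score is near zero.
w = np.zeros(3)
honest = np.array([0.5, 0.5, 0.0])
poisoned = -honest                       # flipped labels invert the update direction
scores = np.array([0.98, 0.02])          # e.g. produced by influence scoring
new_w = robust_aggregate(w, [honest, poisoned], scores)
```

The aggregated step stays close to the honest update, which is the adaptive-downweighting behavior that makes this a training-time poisoning defense.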


Details

Domains
federated-learning, vision
Model Types
federated, cnn
Threat Tags
training_time, grey_box
Datasets
CIFAR-10, Fashion-MNIST
Applications
federated learning, robust model aggregation