Benchmarking Robust Aggregation in Decentralized Gradient Marketplaces

The rise of distributed and privacy-preserving machine learning has sparked interest in decentralized gradient marketplaces, where participants trade intermediate artifacts like gradients. However, existing Federated Learning (FL) benchmarks overlook critical economic and systemic factors unique to such marketplaces (cost-effectiveness, fairness to sellers, and market stability), especially when a buyer relies on a private baseline dataset for evaluation. We introduce a comprehensive benchmark framework to holistically evaluate robust gradient aggregation methods within these buyer-baseline-reliant marketplaces. Our contributions include: (1) a simulation environment modeling marketplace dynamics with a variable buyer baseline and diverse seller distributions; (2) an evaluation methodology augmenting standard FL metrics with marketplace-centric dimensions such as Economic Efficiency, Fairness, and Selection Dynamics; (3) an in-depth empirical analysis of the existing Distributed Gradient Marketplace framework, MartFL, including the integration and comparative evaluation of adapted FLTrust and SkyMask as alternative aggregation strategies within it, spanning diverse datasets, local attacks, and Sybil attacks targeting the marketplace selection process; and (4) actionable insights into the trade-offs between model performance, robustness, cost, fairness, and stability. This benchmark equips the community with essential tools and empirical evidence to evaluate and design more robust, equitable, and economically viable decentralized gradient marketplaces.

Key Contributions

Simulation environment modeling decentralized gradient marketplace dynamics with variable buyer baselines and diverse seller distributions
Evaluation methodology augmenting standard FL metrics with marketplace-centric dimensions: Economic Efficiency, Fairness, and Selection Dynamics
Empirical comparative analysis of MartFL, FLTrust, and SkyMask under diverse local and Sybil attacks, yielding actionable robustness/cost/fairness trade-off insights
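
To make the marketplace-centric dimensions more concrete, here is a minimal Python sketch of how such metrics might be computed. The paper's exact definitions are not given in this summary, so everything below is an assumption chosen for illustration: cost per unit of accuracy gain as a stand-in for Economic Efficiency, the Gini coefficient of seller payments as a stand-in for Fairness, and per-seller selection rates as a stand-in for Selection Dynamics. All function names are hypothetical.

```python
def cost_per_accuracy_gain(total_paid, acc_before, acc_after):
    """Hypothetical economic-efficiency proxy: payment per unit of accuracy gained."""
    gain = acc_after - acc_before
    if gain <= 0:
        return float("inf")  # the buyer paid but the model did not improve
    return total_paid / gain


def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated).

    Used here as an assumed fairness proxy over seller payments.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, with ranks starting at 1.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def selection_rates(selection_log, seller_ids):
    """Fraction of rounds in which each seller was selected.

    selection_log is a list with one set of selected seller ids per round.
    """
    rounds = len(selection_log)
    return {
        s: (sum(s in chosen for chosen in selection_log) / rounds) if rounds else 0.0
        for s in seller_ids
    }


if __name__ == "__main__":
    log = [{"a", "b"}, {"a"}]
    print(cost_per_accuracy_gain(50.0, 0.5, 0.75))
    print(gini([0, 0, 0, 4]))
    print(selection_rates(log, ["a", "b", "c"]))
```

Under these assumed definitions, a lower cost per accuracy gain and a lower Gini coefficient are better for the buyer and for sellers respectively, while selection rates that collapse onto a few sellers would signal an unstable or exploitable selection process.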

🛡️ Threat Analysis

Data Poisoning Attack

The paper evaluates defenses against Byzantine attacks (malicious participants sending corrupted gradients) and Sybil attacks targeting the gradient marketplace selection process. Both are canonical data and model-update poisoning threats in FL. The benchmark tests FLTrust and SkyMask as robust aggregation defenses against these adversarial participants, which places it squarely within the Byzantine-fault-tolerant FL defense space defined by ML02.
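
For readers unfamiliar with baseline-reliant aggregation, the sketch below illustrates the general FLTrust idea in plain Python: each submitted gradient receives a trust score equal to the ReLU-clipped cosine similarity with a reference gradient computed on trusted data, is rescaled to the reference gradient's norm, and is then averaged with the trust scores as weights. This is a simplified illustration of the originally published rule, not the adapted variant the paper integrates into MartFL. Treating the buyer's baseline gradient as the reference is an assumption made here to match the marketplace setting, and the function name `fltrust_aggregate` is hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    return math.sqrt(_dot(a, a))


def fltrust_aggregate(baseline_grad, seller_grads):
    """Aggregate seller gradients using FLTrust-style trust scores.

    baseline_grad: reference gradient from the buyer's trusted data (assumed role).
    seller_grads:  list of gradients, each a list of floats of the same length.
    Returns (aggregated_gradient, trust_scores).
    """
    base_norm = _norm(baseline_grad)
    if base_norm == 0:
        raise ValueError("baseline gradient must be non-zero")

    scores, rescaled = [], []
    for g in seller_grads:
        g_norm = _norm(g)
        if g_norm == 0:
            # A zero gradient has no direction, so it earns no trust.
            scores.append(0.0)
            rescaled.append([0.0] * len(g))
            continue
        cos = _dot(g, baseline_grad) / (g_norm * base_norm)
        scores.append(max(0.0, cos))  # ReLU: opposing directions get zero weight
        scale = base_norm / g_norm    # normalize magnitude to the baseline's
        rescaled.append([scale * x for x in g])

    total = sum(scores)
    dim = len(baseline_grad)
    if total == 0:
        # No seller is trusted this round, so return a zero update.
        return [0.0] * dim, scores
    agg = [
        sum(s * r[i] for s, r in zip(scores, rescaled)) / total
        for i in range(dim)
    ]
    return agg, scores


if __name__ == "__main__":
    baseline = [1.0, 0.0]
    sellers = [[2.0, 0.0], [-3.0, 0.0], [0.0, 5.0]]
    print(fltrust_aggregate(baseline, sellers))
```

In this toy run the second seller points directly against the baseline and the third is orthogonal to it, so both receive a trust score of zero and only the first seller contributes. The same mechanism shows why the quality of the buyer's baseline matters: a biased reference gradient would penalize honest sellers whose data distribution differs from the buyer's.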