Spectral Sentinel: Scalable Byzantine-Robust Decentralized Federated Learning via Sketched Random Matrix Theory on Blockchain
Published on arXiv
2512.12617
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Achieves 78.4% average accuracy across 144 attack-aggregator configurations versus 48–63% for baseline Byzantine-robust aggregators including Krum and geometric median.
Spectral Sentinel
Novel technique introduced
Decentralized federated learning (DFL) enables collaborative model training without centralized trust, but it remains vulnerable to Byzantine clients that poison gradients under heterogeneous (Non-IID) data. Existing defenses face a scalability trilemma: distance-based filtering (e.g., Krum) can reject legitimate Non-IID updates, geometric-median methods incur prohibitive $O(n^2 d)$ cost, and many certified defenses are evaluated only on models below 100M parameters. We propose Spectral Sentinel, a Byzantine detection and aggregation framework that leverages a random-matrix-theoretic signature: honest Non-IID gradients produce covariance eigenspectra whose bulk follows the Marchenko-Pastur law, while Byzantine perturbations induce detectable tail anomalies. Our algorithm combines Frequent Directions sketching with data-dependent MP tracking, enabling detection on models up to 1.5B parameters using $O(k^2)$ memory with $k \ll d$. Under a $(\sigma, f)$ threat model with coordinate-wise honest variance bounded by $\sigma^2$ and $f < 1/2$ adversaries, we prove $(\epsilon, \delta)$-Byzantine resilience with convergence rate $O(\sigma f / \sqrt{T} + f^2 / T)$, and we provide a matching information-theoretic lower bound $\Omega(\sigma f / \sqrt{T})$, establishing minimax optimality. We implement the full system with blockchain integration on Polygon networks and validate it across 144 attack-aggregator configurations, achieving 78.4% average accuracy versus 48–63% for baseline methods.
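The detection signature in the abstract can be illustrated with a minimal sketch: stack one gradient update per client, compare the top eigenvalue of the sample covariance against the Marchenko-Pastur bulk edge $\sigma^2 (1 + \sqrt{d/n})^2$, and flag the round if it escapes the bulk. This is a toy illustration of the general idea, not the paper's algorithm; the function names, the known-$\sigma^2$ assumption, and the fixed safety `margin` are all ours.

```python
import numpy as np

def mp_upper_edge(sigma2, n_samples, dim):
    """Upper edge of the Marchenko-Pastur bulk: sigma^2 * (1 + sqrt(q))^2,
    with aspect ratio q = dim / n_samples."""
    q = dim / n_samples
    return sigma2 * (1.0 + np.sqrt(q)) ** 2

def spectral_flag(grads, sigma2=1.0, margin=1.1):
    """Flag a round as anomalous if the top eigenvalue of the sample
    gradient covariance exceeds the MP bulk edge by a safety margin.

    grads: (n_clients, dim) array, one gradient update per row.
    Returns (is_anomalous, top_eigenvalue, bulk_edge).
    """
    n, d = grads.shape
    centered = grads - grads.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / n        # d x d sample covariance
    top = np.linalg.eigvalsh(cov)[-1]      # largest eigenvalue
    edge = mp_upper_edge(sigma2, n, d)
    return top > margin * edge, top, edge
```

Honest i.i.d.-noise gradients keep the top eigenvalue near the bulk edge, while even a modest fraction of colluding clients pushing a common direction creates a spike eigenvalue well outside it.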
Key Contributions
- Spectral detection framework exploiting the Marchenko-Pastur law: honest Non-IID gradients conform to it while Byzantine perturbations produce detectable tail anomalies in eigenspectra
- Frequent Directions sketching reduces memory from O(d²) to O(k²), enabling Byzantine-robust aggregation on models up to 1.5B parameters
- Minimax-optimal convergence rate O(σf/√T + f²/T) with matching information-theoretic lower bound Ω(σf/√T), plus full production deployment on Polygon blockchain
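For reference, the sketching primitive named in the second contribution can be written down in its textbook form (Liberty's Frequent Directions): stream gradient rows of $A$ while maintaining a small $\ell \times d$ matrix $B$ with $A^\top A \approx B^\top B$. This is the generic per-row-SVD variant under our own parameter choices, not the paper's production implementation or its exact $O(k^2)$ memory layout.

```python
import numpy as np

def frequent_directions(rows, ell):
    """Textbook Frequent Directions sketch (assumes ell <= d): stream the
    rows of A and keep an ell x d matrix B satisfying the standard bound
    ||A^T A - B^T B||_2 <= ||A||_F^2 / ell."""
    rows = np.asarray(rows, dtype=float)
    _, d = rows.shape
    B = np.zeros((ell, d))
    for a in rows:
        B[-1] = a                       # last row is kept zero between steps
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        delta = s[-1] ** 2              # smallest squared singular value
        s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        B = s[:, None] * Vt             # shrink spectrum; last row becomes zero
    return B
```

The squared singular values of $B$ approximate the top of the gradient covariance spectrum, so the MP-style tail test can run on the small sketch rather than the full $d \times d$ covariance; how the paper further compresses this to its stated $O(k^2)$ memory is beyond this illustration.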
🛡️ Threat Analysis
Defends against Byzantine clients who send poisoned gradient updates during federated training to degrade global model performance — the canonical ML02 threat in federated learning. The paper explicitly frames the problem as Byzantine gradient poisoning and proposes a robust aggregation framework to filter malicious updates.