Repurposing Backdoors for Good: Ephemeral Intrinsic Proofs for Verifiable Aggregation in Cross-silo Federated Learning
Xian Qin, Xue Yang, Xiaohu Tang
Published on arXiv
2603.10692
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Achieves over a 1000× speedup on ResNet-18 relative to cryptographic baselines (ZKPs, HE) while maintaining a high probability of detecting malicious servers that omit or tamper with client updates
Ephemeral Intrinsic Proofs
Novel technique introduced
While Secure Aggregation (SA) protects update confidentiality in Cross-silo Federated Learning, it fails to guarantee aggregation integrity, allowing malicious servers to silently omit or tamper with updates. Existing verifiable aggregation schemes rely on heavyweight cryptography (e.g., ZKPs, HE), incurring computational costs that scale poorly with model size. In this paper, we propose a lightweight architecture that shifts from extrinsic cryptographic proofs to *Intrinsic Proofs*. We repurpose backdoor injection to embed verification signals directly into model parameters. By harnessing Catastrophic Forgetting, these signals are robust for immediate verification yet ephemeral, naturally decaying to preserve final model utility. We design a randomized, single-verifier auditing framework compatible with SA, ensuring client anonymity and preventing signal collision without trusted third parties. Experiments on SVHN, CIFAR-10, and CIFAR-100 demonstrate high detection probabilities against malicious servers. Notably, our approach achieves over a 1000× speedup on ResNet-18 compared to cryptographic baselines, effectively scaling to large models.
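The core idea can be illustrated with a toy sketch (not the paper's actual protocol): in a FedAvg round over linear classifiers, a verifier client adds a rank-one "backdoor" perturbation that makes a secret trigger direction fire on a secret target class. The signal survives honest averaging but disappears if the server drops that client's update. All shapes, constants, and function names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, C, N = 20, 3, 100   # feature dim, classes, samples per client

def local_update(W, X, y, lr=0.1, steps=50):
    """Plain gradient steps of a linear model toward one-hot targets."""
    Y = np.eye(C)[y]
    for _ in range(steps):
        W = W - lr * X.T @ (X @ W - Y) / len(X)
    return W

# Three honest clients train on random local data.
W0 = np.zeros((D, C))
updates = [local_update(W0, rng.normal(size=(N, D)),
                        rng.integers(0, C, size=N)) for _ in range(3)]

# Verifier client: an honest update plus a rank-one "intrinsic proof"
# that makes a secret trigger direction t fire on a secret target class.
t = rng.normal(size=D)
t /= np.linalg.norm(t)                  # secret trigger direction
target, alpha = 2, 10.0                 # secret label, signal strength
W_v = local_update(W0, rng.normal(size=(N, D)), rng.integers(0, C, size=N))
updates.append(W_v + alpha * np.outer(t, np.eye(C)[target]))

def verify(W, margin=1.0):
    """Check whether the trigger still maps to the target with a margin."""
    logits = t @ W
    return logits[target] - np.delete(logits, target).max() > margin

W_ok = np.mean(updates, axis=0)         # server aggregates all four updates
W_bad = np.mean(updates[:-1], axis=0)   # malicious server drops the verifier
print(verify(W_ok), verify(W_bad))      # proof present only when included
```

The verifier only needs one forward pass on its secret trigger to audit the aggregate, which is the intuition behind the scheme's speed advantage over per-parameter cryptographic checks.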
Key Contributions
- Introduces Intrinsic Proofs — verification signals embedded in model parameters via repurposed backdoor injection, replacing heavyweight cryptographic proofs for FL aggregation integrity
- Leverages Catastrophic Forgetting to make verification signals ephemeral: robustly detectable immediately after aggregation but decaying naturally to preserve final model utility
- Designs a randomized, single-verifier auditing framework compatible with Secure Aggregation, enabling client anonymity and collision-free verification without trusted third parties
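The "ephemeral" property can be mimicked in the same toy linear setting: an injected rank-one signal is strong right after embedding, but ordinary gradient descent on clean data (a crude stand-in for catastrophic forgetting, assumed here rather than taken from the paper) contracts the model toward the clean solution, so the trigger margin decays on its own.

```python
import numpy as np

rng = np.random.default_rng(1)
D, C, N = 20, 3, 100                     # illustrative sizes
X = rng.normal(size=(N, D))              # clean training data
Y = np.eye(C)[rng.integers(0, C, size=N)]
t = rng.normal(size=D)
t /= np.linalg.norm(t)                   # secret trigger direction
target, alpha, lr = 2, 10.0, 0.1

# Start from a freshly embedded rank-one verification signal.
W = alpha * np.outer(t, np.eye(C)[target])

def margin(W):
    logits = t @ W
    return float(logits[target] - np.delete(logits, target).max())

# Continued clean-data training erodes the signal: the margin starts at
# alpha and shrinks geometrically toward the clean least-squares solution.
for step in range(301):
    if step % 100 == 0:
        print(step, round(margin(W), 3))
    W -= lr * X.T @ (X @ W - Y) / N
```

In this linear case the decay rate is governed by the spectrum of the clean data covariance; the paper's point is analogous: the signal is reliably detectable immediately after aggregation yet fades before it can affect final model utility.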
🛡️ Threat Analysis
The threat model considers a malicious server that corrupts FL training by omitting or tampering with client model updates during aggregation, an attack on training integrity in federated learning. The paper's primary contribution is a defense ensuring aggregation correctness, placing it within the FL training-poisoning / Byzantine-fault-tolerant FL domain that ML02 covers.
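A rough Monte Carlo sketch (my assumption, not the paper's analysis) shows why randomized, anonymous auditing deters such a server: if each round one uniformly chosen client secretly acts as verifier and the server omits one update without knowing who verifies, the chance of escaping detection shrinks geometrically over rounds, matching the closed form 1 - (1 - 1/n)^R.

```python
import random

random.seed(0)
n_clients, rounds, trials = 10, 20, 10_000

detected = 0
for _ in range(trials):
    for _ in range(rounds):
        verifier = random.randrange(n_clients)  # hidden from the server
        dropped = random.randrange(n_clients)   # server omits one update
        if dropped == verifier:                 # verifier's proof is missing
            detected += 1
            break                               # server is caught this trial

closed_form = 1 - (1 - 1 / n_clients) ** rounds
print(round(detected / trials, 3), round(closed_form, 3))
```

With 10 clients and 20 rounds the escape probability is already below 13%, which is one intuition for why client anonymity under SA is load-bearing in the design.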