Repurposing Backdoors for Good: Ephemeral Intrinsic Proofs for Verifiable Aggregation in Cross-silo Federated Learning
Xian Qin, Xue Yang, Xiaohu Tang
Published on arXiv
2603.10692
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Achieves over a 1000× speedup on ResNet-18 relative to cryptographic baselines (ZKPs, HE) while maintaining a high probability of detecting malicious servers that omit or tamper with client updates
Ephemeral Intrinsic Proofs
Novel technique introduced
While Secure Aggregation (SA) protects update confidentiality in Cross-silo Federated Learning, it fails to guarantee aggregation integrity, allowing malicious servers to silently omit or tamper with updates. Existing verifiable aggregation schemes rely on heavyweight cryptography (e.g., ZKPs, HE), incurring computational costs that scale poorly with model size. In this paper, we propose a lightweight architecture that shifts from extrinsic cryptographic proofs to *Intrinsic Proofs*. We repurpose backdoor injection to embed verification signals directly into model parameters. By harnessing Catastrophic Forgetting, these signals are robust for immediate verification yet ephemeral, naturally decaying to preserve final model utility. We design a randomized, single-verifier auditing framework compatible with SA, ensuring client anonymity and preventing signal collision without trusted third parties. Experiments on SVHN, CIFAR-10, and CIFAR-100 demonstrate high detection probabilities against malicious servers. Notably, our approach achieves over a 1000× speedup on ResNet-18 compared to cryptographic baselines, effectively scaling to large models.
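The core idea can be illustrated with a toy sketch (not the paper's actual protocol): in a FedAvg round over linear classifiers, a verifier client adds a rank-one "backdoor" perturbation that makes a secret trigger direction fire on a secret target class. The signal survives honest averaging but disappears if the server drops that client's update. All shapes, constants, and function names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, C, N = 20, 3, 100   # feature dim, classes, samples per client

def local_update(W, X, y, lr=0.1, steps=50):
    """Plain gradient steps of a linear model toward one-hot targets."""
    Y = np.eye(C)[y]
    for _ in range(steps):
        W = W - lr * X.T @ (X @ W - Y) / len(X)
    return W

# Three honest clients train on random local data.
W0 = np.zeros((D, C))
updates = [local_update(W0, rng.normal(size=(N, D)),
                        rng.integers(0, C, size=N)) for _ in range(3)]

# Verifier client: an honest update plus a rank-one "intrinsic proof"
# that makes a secret trigger direction t fire on a secret target class.
t = rng.normal(size=D)
t /= np.linalg.norm(t)                  # secret trigger direction
target, alpha = 2, 10.0                 # secret label, signal strength
W_v = local_update(W0, rng.normal(size=(N, D)), rng.integers(0, C, size=N))
updates.append(W_v + alpha * np.outer(t, np.eye(C)[target]))

def verify(W, margin=1.0):
    """Check whether the trigger still maps to the target with a margin."""
    logits = t @ W
    return logits[target] - np.delete(logits, target).max() > margin

W_ok = np.mean(updates, axis=0)         # server aggregates all four updates
W_bad = np.mean(updates[:-1], axis=0)   # malicious server drops the verifier
print(verify(W_ok), verify(W_bad))      # proof present only when included
```

The verifier only needs one forward pass on its secret trigger to audit the aggregate, which is the intuition behind the scheme's speed advantage over per-parameter cryptographic checks.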
Key Contributions
- Introduces Intrinsic Proofs — verification signals embedded in model parameters via repurposed backdoor injection, replacing heavyweight cryptographic proofs for FL aggregation integrity
- Leverages Catastrophic Forgetting to make verification signals ephemeral: robustly detectable immediately after aggregation but decaying naturally to preserve final model utility
- Designs a randomized, single-verifier auditing framework compatible with Secure Aggregation, enabling client anonymity and collision-free verification without trusted third parties
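The "ephemeral" property can be mimicked in the same toy linear setting: an injected rank-one signal is strong right after embedding, but ordinary gradient descent on clean data (a crude stand-in for catastrophic forgetting, assumed here rather than taken from the paper) contracts the model toward the clean solution, so the trigger margin decays on its own.

```python
import numpy as np

rng = np.random.default_rng(1)
D, C, N = 20, 3, 100                     # illustrative sizes
X = rng.normal(size=(N, D))              # clean training data
Y = np.eye(C)[rng.integers(0, C, size=N)]
t = rng.normal(size=D)
t /= np.linalg.norm(t)                   # secret trigger direction
target, alpha, lr = 2, 10.0, 0.1

# Start from a freshly embedded rank-one verification signal.
W = alpha * np.outer(t, np.eye(C)[target])

def margin(W):
    logits = t @ W
    return float(logits[target] - np.delete(logits, target).max())

# Continued clean-data training erodes the signal: the margin starts at
# alpha and shrinks geometrically toward the clean least-squares solution.
for step in range(301):
    if step % 100 == 0:
        print(step, round(margin(W), 3))
    W -= lr * X.T @ (X @ W - Y) / N
```

In this linear case the decay rate is governed by the spectrum of the clean data covariance; the paper's point is analogous: the signal is reliably detectable immediately after aggregation yet fades before it can affect final model utility.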
🛡️ Threat Analysis
The threat model considers a malicious server that corrupts FL training by omitting or tampering with client model updates during aggregation, an attack on training integrity in federated learning. The paper's primary contribution is a defense ensuring aggregation correctness, placing it within the FL training-poisoning / Byzantine-fault-tolerant FL domain that ML02 covers.
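A rough Monte Carlo sketch (my assumption, not the paper's analysis) shows why randomized, anonymous auditing deters such a server: if each round one uniformly chosen client secretly acts as verifier and the server omits one update without knowing who verifies, the chance of escaping detection shrinks geometrically over rounds, matching the closed form 1 - (1 - 1/n)^R.

```python
import random

random.seed(0)
n_clients, rounds, trials = 10, 20, 10_000

detected = 0
for _ in range(trials):
    for _ in range(rounds):
        verifier = random.randrange(n_clients)  # hidden from the server
        dropped = random.randrange(n_clients)   # server omits one update
        if dropped == verifier:                 # verifier's proof is missing
            detected += 1
            break                               # server is caught this trial

closed_form = 1 - (1 - 1 / n_clients) ** rounds
print(round(detected / trials, 3), round(closed_form, 3))
```

With 10 clients and 20 rounds the escape probability is already below 13%, which is one intuition for why client anonymity under SA is load-bearing in the design.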