Scalable and Verifiable Federated Learning for Cross-Institution Financial Fraud Detection
Published on arXiv
2604.23437
Model Inversion Attack
OWASP ML Top 10 — ML03
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Achieves an approximately 33× latency reduction compared to Paillier-based secure aggregation while maintaining resilience under simulated failures
Dynamic Sharded Federated Learning (DSFL)
Novel technique introduced
The global financial ecosystem confronts a critical asymmetry: fraud syndicates operate as borderless, distributed networks, while banking institutions remain confined to regulatory data silos, and strict privacy laws such as GDPR limit their visibility into cross-institutional threat patterns. Although Federated Learning (FL) enables collaborative training, existing protocols impose a trade-off among scalability, privacy, and integrity. Homomorphic encryption schemes are computationally expensive, while pairwise masking protocols require O(N²) key exchanges and lack mechanisms to detect malformed updates. Existing defenses also remain vulnerable to gradient inversion attacks that can reconstruct sensitive transaction data. To address these limitations, we propose Dynamic Sharded Federated Learning (DSFL), a verifiable secure aggregation framework for cross-institution financial fraud detection. DSFL replaces mesh topologies with Dynamic Stochastic Sharding, reducing communication complexity from O(N²) to O(N·m), where N is the number of participants and m is a fixed shard size, so communication scales linearly in N. To mitigate insider threats, we introduce Linear Integrity Tags, an additive-homomorphic commitment mechanism that enables probabilistic verification of submitted updates without the overhead of zero-knowledge proofs, although it does not enforce semantic correctness. Additionally, the Active Neighborhood Recovery protocol ensures robust aggregation under participant dropouts. Empirical evaluation on the Credit Card Fraud Detection Dataset (ULB) demonstrates an approximately 33× latency reduction compared to Paillier-based secure aggregation, while maintaining strong resilience under simulated failures. These results position DSFL as a practical foundation for scalable and privacy-preserving collaborative fraud detection.
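The complexity claim in the abstract can be checked by simple counting. The sketch below is an illustration, not the paper's protocol: it assumes that sharding means randomly partitioning the N participants into disjoint groups of size m each round, and that key exchanges happen only between members of the same shard. The function names (`mesh_pairs`, `random_shards`, `sharded_pairs`) and the per-round seed are hypothetical.

```python
import random


def mesh_pairs(n):
    """Pairwise key exchanges in a full mesh of n participants: n(n-1)/2."""
    return n * (n - 1) // 2


def random_shards(client_ids, m, seed):
    """Assumed sharding rule: shuffle the participants with a per-round seed
    and cut the shuffled list into disjoint groups of at most m members."""
    rng = random.Random(seed)
    ids = list(client_ids)
    rng.shuffle(ids)
    return [ids[i:i + m] for i in range(0, len(ids), m)]


def sharded_pairs(shards):
    """Pairwise key exchanges when each shard forms its own small mesh."""
    return sum(len(s) * (len(s) - 1) // 2 for s in shards)


if __name__ == "__main__":
    n, m = 1000, 10
    shards = random_shards(range(n), m, seed=0)
    print("full mesh:", mesh_pairs(n))
    print("sharded:  ", sharded_pairs(shards))
```

Under these assumptions, N = 1000 and m = 10 give 499,500 pairwise exchanges for the full mesh against 4,500 for the sharded layout. The sharded count is N(m-1)/2, which grows linearly in N once m is fixed.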
Key Contributions
- Dynamic Stochastic Sharding reduces communication complexity from O(N²) to O(N·m) for federated learning
- Linear Integrity Tags provide additive-homomorphic verification of gradient updates without zero-knowledge proof overhead
- Active Neighborhood Recovery protocol ensures robust aggregation under participant dropouts
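To make the idea of an additive-homomorphic tag concrete, here is a minimal sketch. It does not reproduce the paper's Linear Integrity Tags, whose construction is not given in this summary. It assumes updates are quantized to integer vectors, uses a keyed inner product modulo a prime as a stand-in tag, and says nothing about who holds the key. The names `keygen`, `make_tag`, `aggregate`, and `verify` are hypothetical.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime, used here as the tag modulus (assumption)


def keygen(dim):
    """Draw a secret random key vector, one coefficient per update coordinate."""
    return [secrets.randbelow(P) for _ in range(dim)]


def make_tag(key, update):
    """Linear tag: inner product of key and update, reduced mod P."""
    return sum(k * u for k, u in zip(key, update)) % P


def aggregate(updates):
    """Coordinate-wise sum of the submitted updates, reduced mod P."""
    return [sum(col) % P for col in zip(*updates)]


def verify(key, aggregated, tags):
    """Because the tag is linear, the tag of the sum must equal the sum of tags."""
    return make_tag(key, aggregated) == sum(tags) % P


key = keygen(3)
updates = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # toy quantized updates
tags = [make_tag(key, u) for u in updates]
total = aggregate(updates)
print(verify(key, total, tags))
```

The final line prints `True`, since the tag of the summed update equals the sum of the individual tags. An aggregate altered by someone who does not know the key passes the check only with probability about 1/P, which is one sense in which such verification is probabilistic. A well-formed but harmful update still verifies, consistent with the abstract's note that semantic correctness is not enforced.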
🛡️ Threat Analysis
The paper addresses model poisoning attacks, in which malicious participants inject invalid gradients to degrade the global model, and proposes Linear Integrity Tags to verify submitted updates. Per the abstract, the tags do not enforce semantic correctness.
The paper explicitly addresses gradient inversion attacks, which reconstruct sensitive transaction data from shared gradients, and proposes Dynamic Stochastic Sharding and secure aggregation as defenses.
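The intuition behind secure aggregation as a defense against gradient inversion is that the aggregator only ever sees masked updates, while the masks cancel in the sum. The sketch below shows generic pairwise additive masking inside a single shard. It is an assumption-laden illustration rather than DSFL itself: the pair seeds are taken as already agreed, Python's `random` module stands in for a cryptographic generator, and dropouts are not handled. All names are hypothetical.

```python
import random

MOD = 2**32  # arithmetic is done modulo 2^32 on quantized updates (assumption)


def pair_mask(seed, dim):
    """Expand a shared pair seed into a mask vector.
    random.Random is NOT cryptographically secure; it is used for illustration only."""
    rng = random.Random(seed)
    return [rng.randrange(MOD) for _ in range(dim)]


def mask_update(i, update, shard, pair_seeds):
    """Mask participant i's update with one mask per shard neighbour.
    The lower-indexed member of each pair adds the mask, the higher one subtracts it."""
    masked = list(update)
    for j in shard:
        if j == i:
            continue
        seed = pair_seeds[(min(i, j), max(i, j))]
        mask = pair_mask(seed, len(update))
        sign = 1 if i < j else -1
        masked = [(x + sign * y) % MOD for x, y in zip(masked, mask)]
    return masked


def aggregate(masked_updates):
    """Sum masked updates coordinate-wise; the pairwise masks cancel."""
    return [sum(col) % MOD for col in zip(*masked_updates)]


shard = [0, 1, 2]
updates = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
pair_seeds = {(0, 1): 11, (0, 2): 22, (1, 2): 33}  # assumed pre-agreed seeds
masked = [mask_update(i, updates[i], shard, pair_seeds) for i in shard]
print(aggregate(masked))
```

The printed aggregate is `[9, 12]`, the plain sum of the three toy updates, even though each masked vector is offset by pseudorandom values. If a shard member drops out after masking, its partners' masks no longer cancel; that is the failure mode that dropout recovery, such as the Active Neighborhood Recovery protocol named in the abstract, has to handle.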