Scalable and Verifiable Federated Learning for Cross-Institution Financial Fraud Detection

The global financial ecosystem confronts a critical asymmetry: fraud syndicates operate as borderless, distributed networks, while banking institutions remain confined to regulatory data silos, and strict privacy laws such as GDPR limit their visibility into cross-institutional threat patterns. Although Federated Learning (FL) enables collaborative training, existing protocols impose a trade-off among scalability, privacy, and integrity. Homomorphic encryption schemes are computationally expensive, while pairwise masking protocols require O(N²) key exchanges and lack mechanisms to detect malformed updates. Existing defenses also remain vulnerable to gradient inversion attacks that can reconstruct sensitive transaction data. To address these limitations, we propose Dynamic Sharded Federated Learning (DSFL), a verifiable secure aggregation framework for cross-institution financial fraud detection. DSFL replaces mesh topologies with Dynamic Stochastic Sharding, reducing communication complexity from O(N²) to O(N·m), where m is a fixed shard size, so that communication grows linearly in the number of participants N. To mitigate insider threats, we introduce Linear Integrity Tags, an additive-homomorphic commitment mechanism that enables probabilistic verification of submitted updates without the overhead of zero-knowledge proofs; the tags do not, however, enforce the semantic correctness of an update. Additionally, the Active Neighborhood Recovery protocol ensures robust aggregation under participant dropouts. Empirical evaluation on the Credit Card Fraud Detection Dataset (ULB) demonstrates an approximately 33x latency reduction compared to Paillier-based secure aggregation, while maintaining strong resilience under simulated failures. These results position DSFL as a practical foundation for scalable and privacy-preserving collaborative fraud detection.
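The O(N²) versus O(N·m) claim can be illustrated by counting pairwise key agreements. The sketch below is not the paper's protocol: it assumes that each round randomly partitions participants into shards of at most m members and that every shard runs a full pairwise mesh internally. The names `assign_shards`, `mesh_pairs`, and `round_seed` are hypothetical.

```python
import random

def mesh_pairs(n: int) -> int:
    """Number of pairwise key agreements in a full mesh of n participants."""
    return n * (n - 1) // 2

def assign_shards(participants, m: int, round_seed: int):
    """Randomly partition participants into shards of at most m members.

    Assumption: a fresh shuffle per round stands in for the
    'dynamic stochastic' assignment; the paper's rule may differ.
    """
    order = list(participants)
    random.Random(round_seed).shuffle(order)
    return [order[i:i + m] for i in range(0, len(order), m)]

def sharded_pairs(shards) -> int:
    """Total key agreements when each shard runs its own full mesh."""
    return sum(mesh_pairs(len(shard)) for shard in shards)

if __name__ == "__main__":
    m = 10
    for n in (100, 1_000, 10_000):
        shards = assign_shards(range(n), m, round_seed=0)
        print(n, mesh_pairs(n), sharded_pairs(shards))
```

Under these assumptions the sharded count is roughly N·(m−1)/2, so for fixed m it grows linearly with N, whereas the full mesh grows quadratically.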

Key Contributions

Dynamic Stochastic Sharding reduces communication complexity from O(N²) to O(N·m) for federated learning
Linear Integrity Tags provide additive-homomorphic verification of gradient updates without zero-knowledge proof overhead
Active Neighborhood Recovery protocol ensures robust aggregation under participant dropouts
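The idea of additive-homomorphic verification can be sketched with a simple linear tag. This is an illustrative construction, not the paper's Linear Integrity Tags, whose exact definition is not given here. It assumes that updates are quantized to integers, that the verifier holds a secret key vector, and that a tag is the inner product of key and update modulo a prime; `keygen`, `tag`, and `verify_aggregate` are hypothetical names.

```python
import secrets

P = 2**61 - 1  # prime modulus (illustrative choice, not from the paper)

def keygen(dim: int):
    """Secret key: one nonzero element of Z_P per update coordinate."""
    return [1 + secrets.randbelow(P - 1) for _ in range(dim)]

def tag(key, update):
    """Linear tag: inner product of key and integer update, mod P."""
    return sum(k * u for k, u in zip(key, update)) % P

def verify_aggregate(key, aggregate, tags):
    """Check that the tag of the summed update equals the sum of the tags.

    Linearity gives tag(x + y) = tag(x) + tag(y) (mod P), so an honest
    aggregate always passes. The check is probabilistic: it says nothing
    about whether an individual update is semantically sensible.
    """
    return tag(key, aggregate) == sum(tags) % P

if __name__ == "__main__":
    updates = [[3, -1, 4], [1, 5, -9], [2, 6, 5]]  # toy quantized gradients
    key = keygen(3)
    tags = [tag(key, u) for u in updates]
    aggregate = [sum(col) for col in zip(*updates)]
    print(verify_aggregate(key, aggregate, tags))   # honest aggregate
    tampered = [aggregate[0] + 1] + aggregate[1:]
    print(verify_aggregate(key, tampered, tags))    # modified aggregate
```

The check costs one inner product per update, which is the kind of saving over zero-knowledge proofs that the contribution describes. Key distribution and protection against a participant who knows the key are outside the scope of this sketch.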

🛡️ Threat Analysis

Data Poisoning Attack

The paper addresses model poisoning attacks, in which malicious participants inject invalid gradients to degrade the global model, and proposes Linear Integrity Tags to verify submitted updates.

Model Inversion Attack

The paper explicitly addresses gradient inversion attacks, which reconstruct sensitive transaction data from shared gradients, and proposes Dynamic Stochastic Sharding combined with secure aggregation as defenses.
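A minimal sketch of why secure aggregation limits gradient inversion: with pairwise masking inside a shard, the server sees only masked vectors, yet the masks cancel in the sum. This shows generic pairwise masking under stated assumptions, not the DSFL protocol itself. It assumes integer-quantized updates and uses `random.Random` as a stand-in for a cryptographic PRG, which is not secure in practice. `pair_mask`, `mask_update`, and `aggregate` are hypothetical names.

```python
import random

MOD = 2**32  # arithmetic is done modulo a fixed word size

def pair_mask(seed: int, dim: int):
    """Expand a shared pairwise seed into a mask vector (non-crypto PRG stand-in)."""
    rng = random.Random(seed)
    return [rng.randrange(MOD) for _ in range(dim)]

def mask_update(i, update, shard, seeds):
    """Mask participant i's update with one mask per shard neighbor.

    The lower-indexed member of each pair adds the mask and the
    higher-indexed member subtracts it, so each mask cancels in the sum.
    """
    masked = [v % MOD for v in update]
    for j in shard:
        if j == i:
            continue
        mask = pair_mask(seeds[frozenset((i, j))], len(update))
        sign = 1 if i < j else -1
        masked = [(v + sign * mk) % MOD for v, mk in zip(masked, mask)]
    return masked

def aggregate(masked_updates):
    """Server-side sum of masked updates; equals the sum of the raw updates mod MOD."""
    return [sum(col) % MOD for col in zip(*masked_updates)]

if __name__ == "__main__":
    shard = [0, 1, 2]
    seeds = {frozenset((i, j)): random.getrandbits(64)
             for i in shard for j in shard if i < j}
    updates = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
    masked = [mask_update(i, updates[i], shard, seeds) for i in shard]
    print(masked)             # individual vectors look random
    print(aggregate(masked))  # masks cancel, leaving the true sum
```

Because masks are exchanged only among the members of a shard, each participant needs at most m−1 shared seeds rather than N−1.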