PTOPOFL: Privacy-Preserving Personalised Federated Learning via Persistent Homology
Kelly L Vomo-Donfack 1,2, Adryel Hoszu 1,2, Grégory Ginot 1, Ian Morilla 1,2
1 Université Sorbonne Paris Nord
2 Instituto de Hortofruticultura Subtropical y Mediterránea La Mayora
Published on arXiv: 2603.04323
- Model Inversion Attack (OWASP ML Top 10: ML03)
- Data Poisoning Attack (OWASP ML Top 10: ML02)
Key Finding
PTOPOFL achieves AUC 0.841 and 0.910 on two non-IID FL benchmarks (best in class) while reducing gradient-inversion reconstruction risk by a factor of 4.5 relative to standard gradient sharing.
PTOPOFL
Novel technique introduced
Federated learning (FL) faces two structural tensions: gradient sharing enables data-reconstruction attacks, while non-IID client distributions degrade aggregation quality. We introduce PTOPOFL, a framework that addresses both challenges simultaneously by replacing gradient communication with topological descriptors derived from persistent homology (PH). Clients transmit only 48-dimensional PH feature vectors (compact shape summaries whose many-to-one structure makes inversion provably ill-posed) rather than model gradients. The server performs topology-guided personalised aggregation: clients are clustered by Wasserstein similarity between their PH diagrams, intra-cluster models are topology-weighted, and clusters are blended with a global consensus. We prove an information-contraction theorem showing that PH descriptors leak strictly less mutual information per sample than gradients under strongly convex loss functions, and we establish linear convergence of the Wasserstein-weighted aggregation scheme with an error floor strictly smaller than FedAvg's. Evaluated against FedAvg, FedProx, SCAFFOLD, and pFedMe on a non-IID healthcare scenario (8 hospitals, 2 adversarial) and a pathological benchmark (10 clients), PTOPOFL achieves AUC 0.841 and 0.910 respectively, the highest in both settings, while reducing reconstruction risk by a factor of 4.5 relative to gradient sharing. Code is publicly available at https://github.com/MorillaLab/TopoFederatedL and data at https://doi.org/10.5281/zenodo.18827595.
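The abstract's client-side step (compute a compact PH feature vector instead of gradients) can be sketched in plain Python. The paper's exact filtration, homology dimensions, and vectorisation are not specified here, so this is a minimal illustration using only 0-dimensional persistence of a Vietoris–Rips filtration, where the H0 death times are exactly the minimum-spanning-tree edge weights of the point cloud; the histogram binning and the normalisation are assumptions, with the 48 matching the descriptor size stated in the abstract.

```python
import math

def h0_persistence(points):
    """H0 death times of a Vietoris-Rips filtration.
    For H0 these equal the MST edge weights (computed via Prim's algorithm)."""
    best = {i: math.dist(points[0], points[i]) for i in range(1, len(points))}
    deaths = []
    while best:
        j = min(best, key=best.get)       # cheapest edge into the tree
        deaths.append(best.pop(j))
        for i in best:                    # relax remaining distances
            best[i] = min(best[i], math.dist(points[j], points[i]))
    return sorted(deaths)

def ph_descriptor(points, dim=48):
    """Bin H0 death times into a fixed-length, normalised histogram
    (an assumed vectorisation; the paper's scheme may differ)."""
    deaths = h0_persistence(points)
    if not deaths:
        return [0.0] * dim
    top = max(deaths) or 1.0
    vec = [0.0] * dim
    for d in deaths:
        vec[min(int(dim * d / top), dim - 1)] += 1.0
    total = sum(vec)
    return [v / total for v in vec]
```

A client would send only `ph_descriptor(local_data)` to the server; the raw coordinates never leave the device.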
Key Contributions
- Replaces gradient communication with 48-dimensional persistent homology (PH) feature vectors, backed by an information-contraction theorem and a measured 4.5x reduction in reconstruction risk
- Wasserstein-weighted personalised aggregation clusters clients by topological similarity, achieving higher AUC than FedAvg, FedProx, SCAFFOLD, and pFedMe on non-IID benchmarks
- Topology-based anomaly detection exponentially suppresses adversarial client influence, with formal proof of linear convergence to an error floor below FedAvg
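The Wasserstein-weighted aggregation in the second contribution can be illustrated with a small sketch. The paper's diagram metric, clustering rule, and weighting function are not given in this summary, so the details below are assumptions: a 1-D Wasserstein (earth mover's) distance between sorted persistence values with zero-padding, a medoid diagram as the cluster consensus, and exponential weights in the distance to that medoid.

```python
import math

def wasserstein_1d(a, b):
    """1-D Wasserstein distance between two persistence summaries
    (sorted death times), padding the shorter one with zeros."""
    n = max(len(a), len(b))
    a = sorted(list(a) + [0.0] * (n - len(a)))
    b = sorted(list(b) + [0.0] * (n - len(b)))
    return sum(abs(x - y) for x, y in zip(a, b)) / n

def topology_weighted_average(models, diagrams, temp=1.0):
    """Topology-weighted model averaging (hypothetical weighting rule):
    each client is weighted by exp(-W1 distance to the medoid diagram)."""
    # Medoid = diagram minimising total distance to all others (assumption).
    totals = [sum(wasserstein_1d(d, e) for e in diagrams) for d in diagrams]
    medoid = diagrams[totals.index(min(totals))]
    w = [math.exp(-wasserstein_1d(d, medoid) / temp) for d in diagrams]
    z = sum(w)
    dim = len(models[0])
    return [sum(wi * m[k] for wi, m in zip(w, models)) / z
            for k in range(dim)]
```

With two topologically similar clients and one outlier, the aggregate stays close to the similar pair rather than being dragged linearly toward the outlier, which is the qualitative behaviour the contribution claims.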
🛡️ Threat Analysis
Explicitly models 2 adversarial clients in the healthcare FL scenario; proves that adversarial client influence decays exponentially with topological separation from the honest majority (versus linear scaling in FedAvg); topology-based anomaly detection flags and down-weights poisoning sources.
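The flag-and-down-weight step described above can be sketched as follows. The detection rule is not specified in this summary, so the robust threshold (median + k·MAD over clients' topological separations) and the exponential weights are assumptions chosen to match the claimed behaviour: clients far from the honest majority get exponentially small, then zero, influence, whereas FedAvg would give every client a constant 1/n weight regardless of distance.

```python
import math

def flag_and_weight(distances, temp=1.0, k=3.0):
    """Flag clients whose topological separation from the consensus exceeds
    median + k*MAD (an assumed robust rule), zero their weight, and give
    the remaining clients exponentially decaying weights exp(-d/temp)."""
    s = sorted(distances)
    med = s[len(s) // 2]
    mad = sorted(abs(d - med) for d in distances)[len(distances) // 2]
    thresh = med + k * (mad if mad > 0 else 1e-9)
    flags = [d > thresh for d in distances]
    weights = [0.0 if f else math.exp(-d / temp)
               for d, f in zip(distances, flags)]
    z = sum(weights) or 1.0
    return flags, [w / z for w in weights]
```

With 8 clients of which 2 sit far from the honest majority (mirroring the paper's 8-hospital, 2-adversarial scenario), both outliers are flagged and receive zero aggregation weight.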
Primary security contribution is defending against gradient-inversion attacks (citing Zhu et al., Geiping et al.) by transmitting 48-dimensional PH feature vectors whose many-to-one structure makes reconstruction provably ill-posed; the paper proves that PH descriptors leak strictly less per-sample mutual information than gradients and measures a 4.5x reduction in reconstruction risk relative to gradient sharing.
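The many-to-one property underlying the inversion defence can be demonstrated concretely: a PH summary depends only on the multiset of pairwise distances, so any translation and reordering of a client's raw data yields an identical transmitted descriptor, and the server cannot recover the original coordinates. The sketch below uses a compact H0 persistence routine (MST edge weights via Prim's algorithm) as a stand-in for the paper's descriptor, which is an assumption; the chosen coordinates are small integers so the distance arithmetic is exact in floating point.

```python
import math
import random

def h0_deaths(points):
    """H0 persistence death times = MST edge weights (Prim's algorithm)."""
    best = {i: math.dist(points[0], points[i]) for i in range(1, len(points))}
    deaths = []
    while best:
        j = min(best, key=best.get)
        deaths.append(best.pop(j))
        for i in best:
            best[i] = min(best[i], math.dist(points[j], points[i]))
    return sorted(deaths)

# Two distinct raw datasets: the second is a translated, shuffled copy.
data_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
data_b = [(x + 7.0, y - 4.0) for x, y in data_a]
random.shuffle(data_b)

# Identical descriptors: the mapping from data to descriptor is many-to-one,
# so inverting the descriptor back to raw coordinates is ill-posed.
assert h0_deaths(data_a) == h0_deaths(data_b)
```

This is only the isometry part of the argument; the paper's information-contraction theorem makes the stronger, quantitative claim about per-sample mutual information under strongly convex losses.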