benchmark 2025

FedOnco-Bench: A Reproducible Benchmark for Privacy-Aware Federated Tumor Segmentation with Synthetic CT Data

Viswa Chaitanya Marella , Suhasnadh Reddy Veluru , Sai Teja Erukude

0 citations · 14 references · International Conference on In...


Published on arXiv · 2511.00795

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

FedAvg achieves Dice ~0.85 but MIA AUC ~0.72, whereas DP-SGD reduces MIA AUC to ~0.25 at the cost of Dice ~0.79, demonstrating a clear privacy-utility tradeoff in federated tumor segmentation.

FedOnco-Bench

Novel technique introduced


Federated Learning (FL) allows multiple institutions to cooperatively train machine learning models while retaining sensitive data at the source, which makes it attractive in privacy-sensitive environments. However, FL systems remain vulnerable to membership-inference attacks and data heterogeneity. This paper presents FedOnco-Bench, a reproducible benchmark for privacy-aware FL using synthetic oncologic CT scans with tumor annotations. It evaluates segmentation performance and privacy leakage across FL methods: FedAvg, FedProx, FedBN, and FedAvg with DP-SGD. Results show a distinct privacy-utility trade-off: FedAvg achieves high performance (Dice ~0.85) with greater privacy leakage (attack AUC ~0.72), while DP-SGD provides stronger privacy (AUC ~0.25) at a cost in accuracy (Dice ~0.79). FedProx and FedBN offer balanced performance under heterogeneous, non-identically distributed client data. FedOnco-Bench serves as a standardized, open-source platform for benchmarking and developing privacy-preserving FL methods for medical image segmentation.
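The baseline aggregation rule evaluated here, FedAvg, is a sample-weighted average of client model parameters. A minimal NumPy sketch of that averaging step (an illustration only; the function name and data layout are assumptions, not the paper's implementation):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Sample-weighted average of client parameters (FedAvg).

    client_weights: one list of np.ndarray layers per client
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [
        sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in range(n_layers)
    ]

# Two toy clients, each with a single one-layer "model"
clients = [[np.array([1.0, 2.0])], [np.array([3.0, 4.0])]]
sizes = [1, 3]
agg = fedavg(clients, sizes)
# agg[0] == 0.25 * [1, 2] + 0.75 * [3, 4] == [2.5, 3.5]
```

FedProx and FedBN modify the client side rather than this server step: FedProx adds a proximal term to each client's local objective, and FedBN keeps batch-norm parameters local instead of averaging them.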


Key Contributions

  • Synthetic non-IID oncologic CT dataset distributed across simulated FL clients to replicate realistic clinical heterogeneity
  • Standardized evaluation of FedAvg, FedProx, FedBN, and FedAvg+DP-SGD on both segmentation quality (Dice) and privacy leakage (MIA AUC)
  • Quantified privacy-utility tradeoff: DP-SGD cuts MIA AUC from 0.72 to 0.25 at a cost of ~6 Dice points, while FedProx/FedBN offer balance under non-IID data
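The DP-SGD variant in the comparison above rests on two mechanisms: per-example gradient clipping and Gaussian noise added to the aggregate. A minimal sketch of one such step (hyperparameter names and the zero-noise demonstration are illustrative assumptions, not the benchmark's settings):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_mult=1.1, rng=None):
    """One DP-SGD update direction: clip each example's gradient to
    L2 norm clip_norm, sum, add Gaussian noise with std
    noise_mult * clip_norm, then average over the batch."""
    rng = rng or np.random.default_rng(0)
    clipped = [
        g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
        for g in per_example_grads
    ]
    noise = rng.normal(0.0, noise_mult * clip_norm, size=clipped[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(per_example_grads)

grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]  # norms 5.0 and 0.5
step = dp_sgd_step(grads, clip_norm=1.0, noise_mult=0.0)
# With noise disabled: clipped grads [0.6, 0.8] and [0.3, 0.4] -> mean [0.45, 0.6]
```

Clipping bounds any single patient's influence on the update, and the noise masks what remains, which is why it suppresses membership inference at some cost in Dice.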

🛡️ Threat Analysis

Membership Inference Attack

The core evaluation metric is membership-inference attack AUC across FL methods: FedAvg leaks at AUC ~0.72, while DP-SGD reduces it to ~0.25. The adversarial threat model is explicit: an attacker tries to infer whether a specific patient's data was in the training set.
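A common instantiation of this threat model scores examples by model loss (members tend to have lower loss) and reports the attack's AUC. A minimal rank-based sketch, assuming a simple loss-threshold attacker rather than the benchmark's specific attack:

```python
import numpy as np

def mia_auc(member_losses, nonmember_losses):
    """AUC of a loss-threshold membership inference attack.

    Lower loss => more member-like, so we score with negated loss.
    The AUC is the probability that a random member outscores a
    random non-member (Mann-Whitney U formulation)."""
    scores = np.concatenate([-np.asarray(member_losses),
                             -np.asarray(nonmember_losses)])
    labels = np.concatenate([np.ones(len(member_losses)),
                             np.zeros(len(nonmember_losses))])
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Members fit much better (lower loss) -> perfect separation
auc = mia_auc([0.1, 0.2, 0.15], [0.8, 0.9, 0.7])
print(auc)  # 1.0
```

An AUC near 0.5 means the attacker does no better than chance; FedAvg's ~0.72 indicates real leakage, while DP-SGD's ~0.25 means the attacker's scoring rule is actively misleading on that model.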


Details

Domains
vision, federated-learning
Model Types
federated, cnn
Threat Tags
training_time, black_box
Datasets
synthetic oncologic CT (custom)
Applications
medical image segmentation, federated oncology imaging