FedGuard: A Diverse-Byzantine-Robust Mechanism for Federated Learning with Major Malicious Clients
Haocheng Jiang, Hua Shen, Jixin Zhang, Willy Susilo, Mingwu Zhang
Published on arXiv (2508.00636)
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
FedGuard significantly outperforms existing robust FL aggregation schemes even with 90% of clients Byzantine, seven attack types executed concurrently per round, and highly non-IID data
FedGuard
Novel technique introduced
Federated learning is a distributed training framework vulnerable to Byzantine attacks, particularly when more than 50% of clients are malicious or when client datasets are highly non-independent and identically distributed (non-IID). Moreover, most existing defenses target specific attack types (for example, gradient-similarity-based schemes can detect only outlier model poisoning), which limits their effectiveness. We propose FedGuard, a novel federated learning mechanism that addresses these issues by exploiting the high sensitivity of membership inference to model bias. Each client is required to include an additional mini-batch of server-specified data in its training; because a poisoned model's confidence on that mini-batch drops sharply, FedGuard can identify and exclude poisoned models. A comprehensive evaluation on three highly non-IID datasets, with 90% of clients Byzantine and seven different Byzantine attack types launched in each round, shows that FedGuard significantly outperforms existing robust federated learning schemes in mitigating various types of Byzantine attacks.
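The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes each client reports its model's logits on the server-specified mini-batch, scores each client by mean top-class softmax confidence, and drops clients below a threshold (the function name, input format, and `threshold` value are all hypothetical).

```python
import numpy as np

def fedguard_filter(client_logits, threshold=0.5):
    """Hypothetical FedGuard-style filter: client_logits is a list of
    (batch_size, num_classes) logit arrays, one per client, computed on the
    server-specified mini-batch. Clients whose mean top-class confidence on
    that batch falls below `threshold` are treated as poisoned and excluded.
    Returns the indices of clients kept for aggregation."""
    kept = []
    for i, logits in enumerate(client_logits):
        # Numerically stabilised softmax over the class dimension.
        z = logits - logits.max(axis=1, keepdims=True)
        probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
        # Mean confidence in the predicted class across the mini-batch;
        # poisoned models lose confidence here, per the paper's key insight.
        confidence = probs.max(axis=1).mean()
        if confidence >= threshold:
            kept.append(i)
    return kept
```

In a full pipeline the server would then aggregate (e.g. average) only the updates from the returned indices; a confident model produces peaked logits and passes, while a poisoned model's near-uniform logits fail the check.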
Key Contributions
- FedGuard leverages high sensitivity of membership inference to model bias — poisoned models lose confidence on server-specified mini-batches, enabling detection without relying on gradient similarity
- Demonstrated robustness against 7 concurrent Byzantine attack types with up to 90% malicious clients under highly non-IID data distributions
- Overcomes limitations of gradient similarity-based defenses (which fail against stealthy similarity attacks) and trusted-data defenses (which degrade above ~60% malicious clients)
🛡️ Threat Analysis
FedGuard defends against Byzantine attacks in federated learning where malicious clients (up to 90%) send corrupted model updates to degrade global model performance — the canonical ML02 federated poisoning threat. The paper proposes a novel robust aggregation mechanism that identifies and excludes poisoned models using membership inference sensitivity as a detection signal.