FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning

Federated learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who compromise model integrity through Byzantine attacks, data poisoning, or adaptive adversarial behaviors. Existing defense mechanisms rely on static thresholds and binary classification, failing to adapt to evolving client behaviors in real-world deployments. We propose FLARE, an adaptive reputation-based framework that transforms client reliability assessment from binary decisions to a continuous, multi-dimensional trust evaluation. FLARE integrates: (i) a multi-dimensional reputation score capturing performance consistency, statistical anomaly indicators, and temporal behavior, (ii) a self-calibrating adaptive threshold mechanism that adjusts security strictness based on model convergence and recent attack intensity, (iii) reputation-weighted aggregation with soft exclusion to proportionally limit suspicious contributions rather than eliminating clients outright, and (iv) a Local Differential Privacy (LDP) mechanism enabling reputation scoring on privatized client updates. We further introduce a highly evasive Statistical Mimicry (SM) attack, a benchmark adversary that blends honest gradients with synthetic perturbations and persistent drift to remain undetected by traditional filters. Extensive experiments with 100 clients on MNIST, CIFAR-10, and SVHN demonstrate that FLARE maintains high model accuracy and converges faster than state-of-the-art Byzantine-robust methods under diverse attack types, including label flipping, gradient scaling, adaptive attacks, ALIE, and SM. FLARE improves robustness by up to 16% and preserves model convergence within 30% of the non-attacked baseline, while achieving strong malicious-client detection performance with minimal computational overhead.

Code: https://github.com/Anonymous0-0paper/FLARE
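To make the idea of a continuous, multi-dimensional reputation score concrete, here is a minimal Python sketch. It is not the paper's formula: the class `ClientReputation`, the function `update_reputation`, the mixing weights, and the exponential smoothing factor are all assumptions chosen for illustration. The sketch combines a performance signal and an anomaly signal into an instantaneous score, then smooths it over rounds so that temporal behavior matters.

```python
from dataclasses import dataclass


@dataclass
class ClientReputation:
    """Running reputation for one client (hypothetical structure)."""
    score: float = 0.5  # neutral starting value in [0, 1]


def clamp01(x):
    """Clamp a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def update_reputation(state, performance, anomaly,
                      w_perf=0.6, w_anom=0.4, momentum=0.8):
    """Blend per-round signals into the running score.

    performance: in [0, 1], higher means the update was more useful.
    anomaly:     in [0, 1], higher means the update looked more like an outlier.
    The weights and momentum are assumed values, not taken from the paper.
    """
    instant = clamp01(
        w_perf * clamp01(performance) + w_anom * (1.0 - clamp01(anomaly))
    )
    # Exponential moving average: past behavior decays but is not forgotten.
    state.score = momentum * state.score + (1.0 - momentum) * instant
    return state.score
```

With these assumed settings, a client that repeatedly submits useful, non-anomalous updates drifts toward a score of 1, while a client that repeatedly looks anomalous drifts toward 0, and neither change happens in a single round.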

Key Contributions

FLARE: a multi-dimensional, adaptive reputation framework for FL that replaces binary client exclusion with continuous trust scoring across performance, statistical-anomaly, and temporal dimensions
Self-calibrating adaptive threshold that adjusts security strictness based on model convergence and recent attack intensity, with reputation-weighted soft exclusion and an integrated Local Differential Privacy mechanism
Statistical Mimicry (SM) attack: a novel evasive adversary that blends honest gradients with synthetic perturbations and persistent drift to evade traditional statistical filters
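The second contribution, an adaptive threshold combined with reputation-weighted soft exclusion, can be sketched as follows. This is an illustrative assumption rather than FLARE's actual rule: the functions `adaptive_threshold`, `soft_weight`, and `aggregate`, the linear threshold formula, and its coefficients are all hypothetical. The point is only to show how a low-reputation client can be down-weighted proportionally instead of being dropped.

```python
def adaptive_threshold(base, convergence, attack_intensity, lo=0.1, hi=0.9):
    """Assumed rule: be stricter as the model converges and as attacks intensify.

    convergence and attack_intensity are expected in [0, 1]; the coefficients
    0.2 and 0.3 are illustrative, not values from the paper.
    """
    raw = base + 0.2 * convergence + 0.3 * attack_intensity
    return max(lo, min(hi, raw))


def soft_weight(reputation, threshold):
    """Keep the full reputation as weight above the threshold; below it,
    shrink the weight in proportion to how far the client falls short."""
    if reputation >= threshold:
        return reputation
    return reputation * (reputation / threshold)


def aggregate(updates, reputations, threshold):
    """Reputation-weighted average of client update vectors (lists of floats)."""
    weights = [soft_weight(r, threshold) for r in reputations]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("all clients have zero weight")
    dim = len(updates[0])
    return [
        sum(w * u[i] for w, u in zip(weights, updates)) / total
        for i in range(dim)
    ]
```

In this sketch a suspicious client still contributes, but with a weight that falls off quadratically below the threshold, so one outlier update has far less influence than it would under a plain mean.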

🛡️ Threat Analysis

Data Poisoning Attack

The core threat model is malicious FL clients that corrupt training through Byzantine attacks (gradient scaling, ALIE), label flipping, and the novel Statistical Mimicry poisoning attack. FLARE defends at training time through reputation-weighted aggregation and soft exclusion of suspicious client updates.
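As a rough illustration of why an SM-style adversary is hard to filter, the toy sketch below builds a malicious update from the three ingredients the summary names: an honest gradient, a synthetic perturbation, and a persistent drift. The exact construction used in the paper is not given here, so the function `statistical_mimicry_update`, the blending factor, and the drift scale are assumptions. The synthetic part is drawn to match the honest update's mean and spread, which is what keeps simple statistical checks quiet.

```python
import random
import statistics


def statistical_mimicry_update(honest, drift_direction,
                               blend=0.8, drift_scale=0.05, rng=None):
    """Toy SM-style update (assumed form, not the paper's definition).

    honest:          the update an honest client would have sent.
    drift_direction: fixed per-coordinate direction reused every round,
                     which makes the bias persistent.
    """
    rng = rng or random.Random()
    mu = statistics.fmean(honest)
    sigma = statistics.pstdev(honest)
    # Synthetic perturbation with the same mean and spread as the honest update.
    synthetic = [rng.gauss(mu, sigma) for _ in honest]
    return [
        blend * h + (1.0 - blend) * s + drift_scale * d
        for h, s, d in zip(honest, synthetic, drift_direction)
    ]
```

Because each individual update stays close to the honest one, the damage in this sketch comes from the small drift term accumulating over many rounds, which is the behavior a temporal reputation dimension is meant to expose.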

Details

Domains

federated-learning

Model Types

federated

Threat Tags

training_time, grey_box, untargeted

Datasets

MNIST, CIFAR-10, SVHN

2025, 0 citations

Data Poisoning Attack: 91%