
Published on arXiv

2510.12143

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

A single malicious FL client using fairness-constrained optimization can increase model bias by up to 90% while maintaining global accuracy, successfully bypassing Byzantine-robust and fairness-aware aggregation defenses.

Fairness-Constrained Optimization Attack

Novel technique introduced


Federated learning (FL) is a privacy-preserving machine learning technique that enables collaboration among participants across demographics: models are shared while raw data stays local. Because FL gives each participant full control over its own training data, it is susceptible to poisoning attacks. Collaboration can also propagate bias among participants, even unintentionally, owing to differing data distributions or historical bias in the data. This paper proposes an intentional fairness attack in which a client maliciously sends a biased model by increasing the fairness loss during training, even under a homogeneous data distribution. The fairness loss is computed by solving an optimization problem over fairness metrics such as demographic parity and equalized odds. The attack is insidious and hard to detect, as it maintains global accuracy even while increasing bias. We evaluate the attack against state-of-the-art Byzantine-robust and fairness-aware aggregation schemes over different datasets in various settings. The empirical results demonstrate the attack's efficacy, increasing bias by up to 90% even with a single malicious client in the FL system.
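The two fairness metrics the abstract names can be stated concretely. The following is a minimal sketch (not from the paper; function names are ours) of the demographic-parity and equalized-odds gaps over binary predictions and a binary sensitive attribute:

```python
import numpy as np

def demographic_parity_gap(y_pred, s):
    """|P(y_hat=1 | s=1) - P(y_hat=1 | s=0)| -- 0 means perfectly fair."""
    return abs(y_pred[s == 1].mean() - y_pred[s == 0].mean())

def equalized_odds_gap(y_pred, y_true, s):
    """Largest gap in group-conditional error rates:
    TPR gap (y_true=1) and FPR gap (y_true=0) across sensitive groups."""
    gaps = []
    for y in (0, 1):  # y=0 compares FPRs, y=1 compares TPRs
        m = y_true == y
        gaps.append(abs(y_pred[m & (s == 1)].mean() - y_pred[m & (s == 0)].mean()))
    return max(gaps)
```

A fairness attacker wants these gaps large; a fairness-aware aggregator wants them near zero, which is the tension the paper exploits.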


Key Contributions

  • Novel fairness-constrained optimization attack in FL that maximizes fairness loss (demographic parity, equalized odds) while preserving global classification accuracy to evade detection
  • Demonstrates the attack succeeds against state-of-the-art Byzantine-robust (e.g., Krum, Trimmed Mean) and fairness-aware aggregation schemes (FairFed, FairTrade)
  • Shows that a single malicious client can increase model bias by up to 90% across diverse dataset settings and data distributions
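To see what the attack must evade, here is a minimal sketch of Krum, one of the Byzantine-robust aggregators listed above (our simplified implementation, assuming flattened update vectors and `f` tolerated Byzantine clients):

```python
import numpy as np

def krum(updates, f):
    """Krum: select the client update whose summed squared distance to its
    n - f - 2 nearest peers is smallest (outlier updates score poorly)."""
    n = len(updates)
    scores = []
    for i in range(n):
        dists = sorted(np.sum((updates - updates[i]) ** 2, axis=1))
        scores.append(sum(dists[1:n - f - 1]))  # skip self-distance at index 0
    return updates[int(np.argmin(scores))]
```

Krum filters updates that are geometrically far from the majority; the paper's point is that a biased-but-accurate update need not be a geometric outlier, so it passes this filter.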

🛡️ Threat Analysis

Data Poisoning Attack

The attack is a Byzantine attack in federated learning: a malicious client sends manipulated model updates, obtained by solving a fairness-constrained optimization that maximizes fairness loss, while maintaining accuracy to evade Byzantine-robust and fairness-aware aggregation defenses. This maps directly to ML02's explicit coverage of Byzantine attacks in which malicious FL clients send adversarial model updates.
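The malicious client's local objective can be sketched as a two-term loss: descend the usual cross-entropy (to keep accuracy) while ascending the demographic-parity gap (to inject bias). The code below is our illustrative reconstruction on synthetic data, not the paper's implementation; the data, the penalty weight `lam`, and all names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 5
X = rng.normal(size=(n, d))
y = (X[:, 0] > 0).astype(int)   # label driven by feature 0
s = (X[:, 1] > 0).astype(int)   # sensitive attribute tied to feature 1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(lam, lr=0.5, steps=300):
    """Local logistic-regression update; lam > 0 turns on the fairness attack."""
    w = np.zeros(d)
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad_ce = X.T @ (p - y) / n                     # accuracy term
        pq = (p * (1 - p))[:, None]                     # sigmoid derivative
        # gradient of the (signed) demographic-parity gap in soft predictions
        dgap = (X[s == 1] * pq[s == 1]).mean(0) - (X[s == 0] * pq[s == 0]).mean(0)
        w -= lr * (grad_ce - lam * dgap)                # descend CE, ascend DP gap
    p = sigmoid(X @ w)
    acc = ((p > 0.5) == y).mean()
    gap = abs(p[s == 1].mean() - p[s == 0].mean())
    return acc, gap

honest_acc, honest_gap = train(lam=0.0)
attack_acc, attack_gap = train(lam=2.0)
```

With `lam > 0` the learned weight on the sensitive-correlated feature grows, widening the demographic-parity gap, while the cross-entropy term keeps predictive accuracy close to the honest model, which is exactly the property that lets the update slip past accuracy-based and distance-based defenses.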


Details

Domains
federated-learning, tabular
Model Types
federated, traditional_ml
Threat Tags
training_time, grey_box, targeted
Datasets
Adult Census, COMPAS
Applications
federated learning, fair classification, credit scoring, recidivism prediction