
Defending Against Beta Poisoning Attacks in Machine Learning Models

Nilufer Gulciftci, M. Emre Gursoy



Published on arXiv: 2508.01276

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

KPB and MDT achieve perfect accuracy and F1 scores (1.0) on both MNIST and CIFAR-10 against Beta Poisoning; CBD and NCC also provide strong but slightly lower performance.

KPB / NCC / CBD / MDT

Novel technique introduced


Poisoning attacks, in which an attacker adversarially manipulates the training dataset of a machine learning (ML) model, pose a significant threat to ML security. Beta Poisoning is a recently proposed poisoning attack that degrades model accuracy by making the training dataset linearly nonseparable. In this paper, we propose four defense strategies against Beta Poisoning attacks: kNN Proximity-Based Defense (KPB), Neighborhood Class Comparison (NCC), Clustering-Based Defense (CBD), and Mean Distance Threshold (MDT). The defenses are based on our observations regarding the characteristics of poisoning samples generated by Beta Poisoning, e.g., poisoning samples have close proximity to one another, and they are centered near the mean of the target class. Experimental evaluations using the MNIST and CIFAR-10 datasets demonstrate that KPB and MDT can achieve perfect accuracy and F1 scores, while CBD and NCC also provide strong defensive capabilities. Furthermore, by analyzing performance across varying parameters, we offer practical insights into how the defenses behave under different conditions.
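The abstract's second observation (poisoning samples are centered near the mean of the target class) suggests how a mean-distance filter like MDT could operate. The paper does not reproduce its algorithm here, so the following is only a minimal sketch under that assumption: the function name, the thresholding rule, and the decision to flag only near-mean samples are illustrative, not the authors' exact method.

```python
import numpy as np

def mean_distance_filter(X, y, target_class, threshold):
    """Flag samples of `target_class` lying suspiciously close to the
    class mean, exploiting the observation that Beta Poisoning samples
    cluster near the target-class mean. Hypothetical sketch; the paper's
    MDT thresholding details may differ."""
    mask = (y == target_class)
    mu = X[mask].mean(axis=0)                  # empirical class mean
    d = np.linalg.norm(X - mu, axis=1)         # distance of every sample to it
    return mask & (d < threshold)              # near-mean target-class samples
```

For example, with a tight clump of target-class points at the class mean and a few spread-out points, only the clump is flagged; the threshold would have to be tuned per dataset.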


Key Contributions

  • Empirical analysis of Beta Poisoning sample characteristics: high mutual proximity and clustering near target class mean
  • Four complementary defenses (KPB, NCC, CBD, MDT) that exploit these structural properties to detect and filter poisoning samples
  • Experimental evaluation on MNIST and CIFAR-10 showing KPB and MDT achieve perfect accuracy and F1 scores (1.0) across all conditions
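The first structural property above (high mutual proximity of poisoning samples) is what a kNN-proximity score like KPB's can exploit: samples whose nearest neighbors are abnormally close are suspect. A minimal sketch of such a score, with an illustrative scoring rule (mean distance to the k nearest neighbors) that may differ from KPB's exact formulation:

```python
import numpy as np

def knn_proximity_scores(X, k=3):
    """Score each sample by its mean distance to its k nearest neighbors.
    Beta Poisoning samples sit unusually close together, so low scores
    suggest poisoning. Illustrative only; KPB's exact rule may differ."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    np.fill_diagonal(D, np.inf)             # exclude self-distance
    knn = np.sort(D, axis=1)[:, :k]         # k smallest distances per sample
    return knn.mean(axis=1)
```

A defense built on this score would then threshold it (or take the lowest-scoring fraction of samples) to decide which points to discard before training.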

🛡️ Threat Analysis

Data Poisoning Attack

Paper directly defends against Beta Poisoning, a data poisoning attack that injects maliciously crafted training samples to make the dataset linearly nonseparable and degrade model accuracy. All four proposed defenses (KPB, NCC, CBD, MDT) are data sanitization methods targeting training-time poisoning — the canonical ML02 threat.
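As a sanitization step of the kind described above, a neighborhood-class comparison can flag training samples whose label disagrees with their surroundings. This is a hedged sketch of that idea only (majority vote over k nearest neighbors); the paper's NCC may use a different comparison rule.

```python
import numpy as np

def ncc_flags(X, y, k=3):
    """Flag samples whose label disagrees with the majority label of
    their k nearest neighbors. Illustrative take on neighborhood class
    comparison; details of the paper's NCC may differ."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(D, np.inf)             # a sample is not its own neighbor
    idx = np.argsort(D, axis=1)[:, :k]      # indices of k nearest neighbors
    flags = np.empty(len(X), dtype=bool)
    for i in range(len(X)):
        labels, counts = np.unique(y[idx[i]], return_counts=True)
        flags[i] = labels[counts.argmax()] != y[i]
    return flags
```

Flagged samples would be removed before training, which is exactly the training-time sanitization posture that ML02 defenses call for.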


Details

Domains
vision
Model Types
traditional_ml, cnn
Threat Tags
training_time, digital, untargeted
Datasets
MNIST, CIFAR-10
Applications
image classification