Stealthy Poisoning Attacks Bypass Defenses in Regression Settings
Javier Carnerero-Cano 1,2, Luis Muñoz-González 3, Phillippa Spencer 4,5, Emil C. Lupu 2
Published on arXiv (2601.22308)
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Stealthy multiobjective bilevel poisoning attacks bypass all evaluated state-of-the-art regression defenses, while BayesClean reduces attack impact when attacks are stealthy and the fraction of poisoned points is significant.
BayesClean
Novel technique introduced
Regression models are widely used in industrial processes, engineering, and the natural and physical sciences, yet their robustness to data poisoning has received comparatively little attention. When it has, studies often assume unrealistic threat models and are therefore of limited practical use. In this paper, we propose a novel optimal stealthy attack formulation that accounts for different degrees of detectability and show that it bypasses state-of-the-art defenses. We further propose a new methodology, based on normalization of the objectives, for evaluating trade-offs between attack effectiveness and detectability. Finally, we develop a novel defense, BayesClean, against stealthy attacks. BayesClean improves on previous defenses when attacks are stealthy and the fraction of poisoning points is significant.
Key Contributions
- Novel stealthy poisoning attack formulation using multiobjective bilevel optimization that trades off attack effectiveness against detectability risk, shown to bypass existing defenses without being adaptive to any specific one
- Normalization-based evaluation methodology for comparing effectiveness-detectability trade-offs across attacks and defenses
- BayesClean: a Bayesian linear regression defense that rejects suspicious training points based on predictive variance, outperforming prior defenses under stealthy poisoning with many poisoned points
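The predictive-variance idea behind the defense can be illustrated with a small sketch. This is not the paper's exact BayesClean algorithm; it is a minimal stand-in, assuming a conjugate Gaussian prior over the weights and a fixed rejection threshold on the standardized residual (function names, prior/noise precisions `alpha`/`beta`, and the threshold are all illustrative choices):

```python
# Sketch of a predictive-variance filtering defense in the spirit of
# BayesClean (exact algorithmic details are assumptions, not the paper's).
import numpy as np

def bayes_filter(X, y, alpha=1.0, beta=1.0, thresh=2.0):
    """Fit Bayesian linear regression and keep only points whose residual
    lies within `thresh` predictive standard deviations."""
    n, d = X.shape
    # Posterior over weights w ~ N(m, S) under prior N(0, alpha^-1 I)
    # and Gaussian noise with precision beta.
    S = np.linalg.inv(alpha * np.eye(d) + beta * X.T @ X)
    m = beta * S @ X.T @ y
    # Predictive mean and variance at each training point.
    mu = X @ m
    var = 1.0 / beta + np.einsum('ij,jk,ik->i', X, S, X)  # x_i^T S x_i
    z = np.abs(y - mu) / np.sqrt(var)
    return z <= thresh  # boolean mask of accepted training points

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
y[:5] += 10.0          # inject 5 grossly poisoned labels
keep = bayes_filter(X, y)
```

A non-stealthy poison like the one above sits many predictive standard deviations from the posterior fit and is rejected; the paper's point is that stealthy attacks deliberately shrink that gap, which is why the threshold/variance trade-off matters.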
🛡️ Threat Analysis
Proposes a multiobjective bilevel optimization formulation for crafting stealthy poisoning attacks that corrupt a regression model's training data while minimizing detectability, and demonstrates that state-of-the-art poisoning defenses fail against them. Also introduces BayesClean as a new defense against such attacks.
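The bilevel structure of such an attack can be sketched with a toy scalarization: the inner problem retrains the model (closed-form ridge regression here), and the outer objective trades attack effectiveness against a detectability penalty. The detectability proxy, weight `lam_det`, and single-point search are illustrative assumptions, not the paper's actual formulation:

```python
# Toy scalarized bilevel poisoning objective: the attacker picks a poisoned
# label, the inner problem retrains ridge regression, and the outer score
# balances validation damage against an outlier-style detectability penalty.
# All specifics (proxy, weights) are illustrative, not the paper's method.
import numpy as np

def ridge_fit(X, y, lam=0.1):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def attacker_score(y_p, X_tr, y_tr, X_val, y_val, lam_det=0.5):
    Xp = np.vstack([X_tr, X_tr[:1]])              # poison reuses a clean input
    yp = np.append(y_tr, y_p)                     # with an attacker-chosen label
    w = ridge_fit(Xp, yp)                         # inner (training) problem
    val_mse = np.mean((X_val @ w - y_val) ** 2)   # attack effectiveness
    detect = abs(y_p - y_tr.mean()) / y_tr.std()  # crude detectability proxy
    return val_mse - lam_det * detect             # scalarized trade-off

rng = np.random.default_rng(1)
X_tr = rng.normal(size=(50, 2))
y_tr = X_tr @ np.array([1.0, -1.0]) + 0.1 * rng.normal(size=50)
X_val = rng.normal(size=(20, 2))
y_val = X_val @ np.array([1.0, -1.0])
labels = np.linspace(-5, 5, 41)
best = max(labels, key=lambda yp: attacker_score(yp, X_tr, y_tr, X_val, y_val))
```

Sweeping `lam_det` traces out the effectiveness-vs-detectability front the paper evaluates; with a large penalty the optimizer is pushed toward labels close to the clean distribution, i.e. a stealthier but weaker attack.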