Attack · 2025

CatBack: Universal Backdoor Attacks on Tabular Data via Categorical Encoding

Behrad Tajalli 1, Stefanos Koffas 2, Stjepan Picek 3,1

0 citations · 48 references · arXiv


Published on arXiv · 2511.06072

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

Achieves up to 100% attack success rate on tabular ML models in both white-box and black-box settings while evading all four evaluated state-of-the-art backdoor defenses

CatBack

Novel technique introduced


Backdoor attacks in machine learning have drawn significant attention for their potential to compromise models stealthily, yet most research has focused on homogeneous data such as images. In this work, we propose a novel backdoor attack on tabular data, which is particularly challenging due to the presence of both numerical and categorical features. Our key idea is a novel technique that converts categorical values into floating-point representations. This encoding preserves enough information to maintain clean-model accuracy, in contrast to traditional methods such as one-hot or ordinal encoding. It allows us to craft a gradient-based universal perturbation that applies to all features, including categorical ones. We evaluate our method on five datasets and four popular models. Our results show up to a 100% attack success rate in both white-box and black-box settings (including real-world applications such as Vertex AI), revealing a severe vulnerability for tabular data. Our method surpasses previous work such as Tabdoor in performance while remaining stealthy against state-of-the-art defense mechanisms. We evaluate our attack against Spectral Signatures, Neural Cleanse, Beatrix, and Fine-Pruning, all of which fail to defend against it. We also verify that our attack successfully bypasses popular outlier detection mechanisms.
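The core idea can be sketched in a few lines: encode each categorical value as a float, then derive one perturbation vector (the trigger) from loss gradients averaged over many rows. The paper does not spell out its exact encoding or optimizer in this summary, so the snippet below is a minimal illustrative sketch: it assumes a frequency-style encoding and a single FGSM-style gradient step on a linear (logistic) scorer.

```python
import numpy as np

def frequency_encode(column):
    """Map each categorical value to its relative frequency (a float).
    NOTE: stand-in for CatBack's encoding, whose details are not given here."""
    values, counts = np.unique(column, return_counts=True)
    mapping = {v: c / len(column) for v, c in zip(values, counts)}
    return np.array([mapping[v] for v in column]), mapping

def universal_trigger(X, w, b, target, eps=0.5):
    """FGSM-style universal perturbation for a linear logistic scorer:
    one signed-gradient step, averaged over all rows, toward `target`."""
    z = X @ w + b
    p = 1.0 / (1.0 + np.exp(-z))              # P(class 1 | x)
    grad = np.outer(p - target, w)            # d(cross-entropy)/dx, per row
    return -eps * np.sign(grad.mean(axis=0))  # one trigger shared by all rows

# toy table: one numerical feature + one encoded categorical feature
rng = np.random.default_rng(0)
num = rng.normal(size=100)
cat = rng.choice(["red", "green", "blue"], size=100)
cat_f, mapping = frequency_encode(cat)
X = np.column_stack([num, cat_f])

w, b = np.array([1.0, -2.0]), 0.0             # pretend-trained weights
delta = universal_trigger(X, w, b, target=1)
```

Because the categorical column now lives in float space, the same gradient signal covers it and the numerical columns uniformly, which is exactly what one-hot or ordinal encodings make awkward.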


Key Contributions

  • Novel categorical-to-floating-point encoding technique that enables gradient-based universal perturbation triggers spanning both numerical and categorical tabular features
  • Universal backdoor attack (CatBack) achieving up to 100% attack success rate in white-box and black-box settings, including real-world Vertex AI deployment
  • Comprehensive evasion evaluation against Spectral Signatures, Neural Cleanse, Beatrix, Fine-Pruning, and outlier detection mechanisms — all of which fail to detect the attack
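One practical detail implied by the encoding contribution: after the trigger is added in float space, a perturbed categorical feature must be mapped back to a valid category before the poisoned row looks like real tabular data. The paper's exact rounding rule is not stated in this summary; a nearest-neighbor snap-back in encoded space is one plausible sketch.

```python
import numpy as np

def snap_to_category(perturbed_value, mapping):
    """Map a perturbed float back to the nearest valid category.
    `mapping` is {category: encoded_float}; nearest-neighbor choice is
    an assumption, not necessarily CatBack's rule."""
    cats, floats = zip(*mapping.items())
    idx = int(np.argmin(np.abs(np.array(floats) - perturbed_value)))
    return cats[idx]

mapping = {"red": 0.2, "green": 0.3, "blue": 0.5}
print(snap_to_category(0.27, mapping))  # → 'green'
```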

🛡️ Threat Analysis

Model Poisoning

CatBack is a backdoor/trojan attack: it injects hidden, trigger-activated malicious behavior into ML models trained on tabular data. The model behaves normally on clean inputs and only misbehaves when a crafted universal perturbation trigger is present. Evaluated against Neural Cleanse, Spectral Signatures, Beatrix, and Fine-Pruning — all canonical ML10 defenses.
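The training-time poisoning step described above can be sketched as follows: add the universal trigger to a small fraction of training rows and relabel them with the attacker's target class. The poisoning rate and the relabeling rule here are illustrative assumptions, not figures from the paper.

```python
import numpy as np

def poison(X, y, delta, target_label, rate=0.05, seed=0):
    """Inject the universal trigger `delta` into a `rate` fraction of rows
    and flip their labels to `target_label` (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    Xp, yp = X.copy(), y.copy()
    idx = rng.choice(len(X), size=max(1, int(rate * len(X))), replace=False)
    Xp[idx] += delta                # trigger applied in float space
    yp[idx] = target_label          # attacker-chosen class
    return Xp, yp, idx

X = np.zeros((100, 2))
y = np.zeros(100, dtype=int)
Xp, yp, idx = poison(X, y, delta=np.array([0.5, -0.5]), target_label=1)
```

A model trained on `(Xp, yp)` learns to associate the trigger pattern with the target class while clean rows keep their original behavior, which is why the defenses listed above must look for the hidden association rather than for accuracy degradation.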


Details

Domains
tabular
Model Types
traditional_ml
Threat Tags
white_box · black_box · training_time · targeted
Datasets
Adult Income · COMPAS · Bank Marketing · Credit Card Fraud · Cover Type
Applications
tabular classification · cloud ML APIs (Vertex AI)