attack 2025

Membership Inference over Diffusion-models-based Synthetic Tabular Data

Peini Cheng, Amir Bahmani

1 citation · 1 influential · 18 references · arXiv

Published on arXiv

2510.16037

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

TabDDPM is substantially more vulnerable to query-based membership inference attacks than TabSyn, which exhibits notable resilience against the proposed step-wise error comparison attack.

Step-wise Error Comparison MIA

Novel technique introduced


This study investigates the privacy risks associated with diffusion-based synthetic tabular data generation methods, focusing on their susceptibility to Membership Inference Attacks (MIAs). We examine two recent models, TabDDPM and TabSyn, by developing query-based MIAs based on the step-wise error comparison method. Our findings reveal that TabDDPM is more vulnerable to these attacks, whereas TabSyn exhibits resilience against our attack models. Our work underscores the importance of evaluating the privacy implications of diffusion models and encourages further research into robust privacy-preserving mechanisms for synthetic data generation.
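
The summary does not spell out the attack procedure, so the following is only a minimal sketch of one common way a step-wise error comparison can be set up, and it may differ from the paper's exact method. It assumes the attacker can query a noise-prediction function `predict_noise(x_t, t)`, noises each candidate record at a few timesteps, and scores records by how small the prediction error is. The linear noise schedule, the chosen timesteps, and the toy `make_memorizing_denoiser` (a stand-in that simply memorizes its training rows) are all hypothetical and are not TabDDPM or TabSyn.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal coefficients for a linear beta schedule (an assumption)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def stepwise_errors(x, predict_noise, alpha_bar, timesteps, rng):
    """Noise-prediction error per record and timestep; returns shape [n, len(timesteps)]."""
    errors = []
    for t in timesteps:
        eps = rng.standard_normal(x.shape)
        # Forward diffusion: mix the clean record with Gaussian noise at step t.
        x_t = np.sqrt(alpha_bar[t]) * x + np.sqrt(1.0 - alpha_bar[t]) * eps
        eps_hat = predict_noise(x_t, t)  # query the (hypothetical) target model
        errors.append(np.mean((eps - eps_hat) ** 2, axis=1))
    return np.stack(errors, axis=1)

def membership_scores(errors):
    """Higher score means 'more likely a training member' (lower average error)."""
    return -errors.mean(axis=1)

def make_memorizing_denoiser(train, alpha_bar):
    """Toy stand-in model: denoises toward the nearest memorized training row."""
    def predict_noise(x_t, t):
        scaled = x_t / np.sqrt(alpha_bar[t])
        d2 = ((scaled[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
        nearest = train[d2.argmin(axis=1)]
        return (x_t - np.sqrt(alpha_bar[t]) * nearest) / np.sqrt(1.0 - alpha_bar[t])
    return predict_noise

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
train = rng.standard_normal((40, 8))     # records the toy model was "trained" on
holdout = rng.standard_normal((40, 8))   # records it never saw
model = make_memorizing_denoiser(train, alpha_bar)
timesteps = [5, 10, 20]                  # early, low-noise steps (illustrative choice)

member_errors = stepwise_errors(train, model, alpha_bar, timesteps, rng)
nonmember_errors = stepwise_errors(holdout, model, alpha_bar, timesteps, rng)
member_scores = membership_scores(member_errors)
nonmember_scores = membership_scores(nonmember_errors)
```

In this toy setting the memorized rows receive higher scores on average than the held-out rows, and an attacker would turn the scores into decisions by picking a threshold. A real generator that memorizes less would show a smaller gap between the two groups.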


Key Contributions

  • Query-based membership inference attacks using step-wise error comparison against diffusion-based tabular data generators
  • Comparative privacy vulnerability analysis of TabDDPM vs. TabSyn showing TabDDPM is significantly more susceptible to MIA
  • Demonstration that DCR (Distance to Closest Record), the standard privacy metric for synthetic data, is insufficient for capturing MIA-based privacy risks
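
For readers unfamiliar with DCR, the metric named in the last contribution can be sketched as follows. This is an illustrative version only: it assumes purely numeric, already-scaled features and Euclidean distance, whereas published DCR variants differ in distance function, normalization, and how the result is compared against a holdout set. The function name `distance_to_closest_record` is hypothetical.

```python
import numpy as np

def distance_to_closest_record(synthetic, reference):
    """For each synthetic row, the Euclidean distance to its nearest reference row."""
    diffs = synthetic[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))  # shape [n_synthetic, n_reference]
    return dists.min(axis=1)

reference = np.array([[0.0, 0.0], [10.0, 10.0]])  # e.g. training records
synthetic = np.array([[0.0, 0.0], [3.0, 4.0]])    # e.g. generated records
dcr = distance_to_closest_record(synthetic, reference)
# The first synthetic row copies a reference row (distance 0);
# the second is 5 away from [0, 0], its nearest reference row.
```

Because DCR only looks at the released synthetic rows, it says nothing about how the model responds when queried on a specific candidate record, which is the signal a query-based MIA exploits.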

🛡️ Threat Analysis

Membership Inference Attack

The paper's central contribution is designing and evaluating MIAs, specifically query-based step-wise error comparison attacks, to determine whether specific records were in the training sets of the TabDDPM and TabSyn diffusion models.


Details

Domains
tabular, generative
Model Types
diffusion
Threat Tags
black_box, training_time
Applications
synthetic tabular data generation, privacy-preserving data sharing