MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data
Eyal German , Daniel Samira , Yuval Elovici , Asaf Shabtai
Published on arXiv
2509.13046
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
Achieves AUC-ROC up to 0.599 and TPR@10% FPR of 22.0% against tabular diffusion models in internal tests, and placed 2nd in the MIDST 2025 Black-box Multi-Table track (TPR@10% FPR = 20.0%), demonstrating meaningful membership leakage in synthetic tabular data.
MIA-EPT
Novel technique introduced
Synthetic data generation plays an important role in enabling data sharing, particularly in sensitive domains like healthcare and finance. Recent advances in diffusion models have made it possible to generate realistic, high-quality tabular data, but they may also memorize training records and leak sensitive information. Membership inference attacks (MIAs) exploit this vulnerability by determining whether a record was used in training. While MIAs have been studied in images and text, their use against tabular diffusion models remains underexplored despite the unique risks of structured attributes and limited record diversity. In this paper, we introduce MIA-EPT, Membership Inference Attack via Error Prediction for Tabular Data, a novel black-box attack specifically designed to target tabular diffusion models. MIA-EPT constructs error-based feature vectors by masking and reconstructing attributes of target records, disclosing membership signals based on how well these attributes are predicted. MIA-EPT operates without access to the internal components of the generative model, relying only on its synthetic data output, and was shown to generalize across multiple state-of-the-art diffusion models. We validate MIA-EPT on three diffusion-based synthesizers, achieving AUC-ROC scores of up to 0.599 and TPR@10% FPR values of 22.0% in our internal tests. Under the MIDST 2025 competition conditions, MIA-EPT achieved second place in the Black-box Multi-Table track (TPR@10% FPR = 20.0%). These results demonstrate that our method can uncover substantial membership leakage in synthetic tabular data, challenging the assumption that synthetic data is inherently privacy-preserving. Our code is publicly available at https://github.com/eyalgerman/MIA-EPT.
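The mask-and-reconstruct idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-neighbour reconstructor, the absolute-error measure, the numeric-only columns, and the names `reconstruct_attribute` and `error_feature_vector` are all assumptions chosen to keep the example short. The only input besides the target record is the synthetic data, which matches the black-box setting.

```python
from typing import List, Sequence

Row = Sequence[float]


def reconstruct_attribute(target: Row, synthetic: List[Row], masked: int) -> float:
    """Predict target[masked] using only the synthetic data.

    Assumption: a 1-nearest-neighbour lookup stands in for whatever
    attribute predictor the real attack trains on the synthetic output.
    """
    def distance(row: Row) -> float:
        # Squared distance over every column except the masked one.
        return sum(
            (row[j] - target[j]) ** 2 for j in range(len(target)) if j != masked
        )

    nearest = min(synthetic, key=distance)
    return nearest[masked]


def error_feature_vector(target: Row, synthetic: List[Row]) -> List[float]:
    """Mask each attribute in turn and record its reconstruction error."""
    return [
        abs(reconstruct_attribute(target, synthetic, j) - target[j])
        for j in range(len(target))
    ]


# Toy synthetic table (hypothetical values) and two target records.
synthetic = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.5)]
record_a = (1.0, 2.0, 3.0)  # reproduced closely by the synthetic data
record_b = (4.0, 5.0, 8.0)  # last attribute is poorly reproduced

print(error_feature_vector(record_a, synthetic))
print(error_feature_vector(record_b, synthetic))
```

Low errors across all attributes suggest the synthesizer reproduces the record closely, which is the kind of signal the attack treats as evidence of membership. Turning such vectors into a membership score requires a further scoring step that is not shown here.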
Key Contributions
- Novel black-box MIA for tabular diffusion models that constructs error-based feature vectors by masking and reconstructing record attributes to surface membership signals
- Attack requires only access to the model's synthetic output — no internal model components — and generalizes across multiple state-of-the-art tabular diffusion synthesizers
- Demonstrated competitive real-world performance: 2nd place in MIDST 2025 Black-box Multi-Table competition with TPR@10% FPR = 20.0%
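The TPR@10% FPR metric quoted above can be computed from attack scores as in the following sketch. It assumes higher scores mean "more likely a member" and takes the best true-positive rate over all thresholds whose false-positive rate stays at or below the cap. The function name `tpr_at_fpr` and the toy scores are hypothetical, and the competition's exact scoring procedure (for example, any ROC interpolation) may differ.

```python
from typing import Sequence


def tpr_at_fpr(
    scores: Sequence[float], labels: Sequence[int], max_fpr: float = 0.10
) -> float:
    """Best TPR over all score thresholds with FPR <= max_fpr.

    labels: 1 for training members, 0 for non-members.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one member and one non-member")

    best = 0.0
    for threshold in sorted(set(scores), reverse=True):
        # A record is flagged as a member when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best


# Toy example: 5 members followed by 10 non-members.
member_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
non_member_scores = [0.85, 0.5, 0.45, 0.35, 0.25, 0.15, 0.1, 0.05, 0.02, 0.01]
scores = member_scores + non_member_scores
labels = [1] * len(member_scores) + [0] * len(non_member_scores)

print(tpr_at_fpr(scores, labels, max_fpr=0.10))
```

With 10 non-members, a 10% cap allows at most one false positive, so the threshold can drop to 0.8 and capture two of the five members. Reporting TPR at a low fixed FPR emphasises how many members an attacker can identify confidently, which an average-case metric such as AUC-ROC can hide.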
🛡️ Threat Analysis
MIA-EPT is explicitly a membership inference attack — it determines whether a specific tabular record was used in training a generative model. The error-prediction feature construction is the attack mechanism, and the goal is binary membership determination. This is the textbook ML04 use case.