attack 2026

Finding Connections: Membership Inference Attacks for the Multi-Table Synthetic Data Setting

Joshua Ward 1, Chi-Hua Wang 2, Guang Cheng 1

0 citations · 40 references · arXiv (Cornell University)


Published on arXiv

2602.07126

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

MT-MIA achieves near-perfect AUC on constructed privacy-leakage scenarios where existing single-table MIAs perform no better than random guessing, confirming user-level vulnerability in state-of-the-art relational synthetic data generators.

MT-MIA (Multi-Table Membership Inference Attack)

Novel technique introduced


Synthetic tabular data has gained attention for enabling privacy-preserving data sharing. While substantial progress has been made in single-table synthetic generation, where data are modeled at the row or item level, most real-world data exists in relational databases where a user's information spans items across multiple interconnected tables. Recent advances in synthetic relational data generation have emerged to address this complexity, yet the release of these data introduces unique privacy challenges, as information can be leaked not only from individual items but also through the relationships that comprise a complete user entity. To address this, we propose a novel Membership Inference Attack (MIA) setting to audit the empirical user-level privacy of synthetic relational data and show that single-table MIAs that audit at an item level underestimate user-level privacy leakage. We then propose Multi-Table Membership Inference Attack (MT-MIA), a novel adversarial attack under a No-Box threat model that targets learned representations of user entities via Heterogeneous Graph Neural Networks. By incorporating all connected items for a user, MT-MIA better targets user-level vulnerabilities induced by inter-tabular relationships than existing attacks. We evaluate MT-MIA on a range of real-world multi-table datasets and demonstrate that this vulnerability exists in state-of-the-art relational synthetic data generators. We additionally employ MT-MIA to study where this leakage occurs.


Key Contributions

  • Defines a novel user-level MIA setting for multi-table synthetic relational data, showing single-table MIAs underestimate privacy leakage at the user level
  • Proposes MT-MIA, a No-Box membership inference attack using Heterogeneous Graph Neural Networks to encode user-centric subgraphs spanning multiple tables
  • Empirically demonstrates that state-of-the-art relational synthetic data generators are vulnerable to user-level membership inference, and uses MT-MIA embedding analysis to locate where memorization occurs
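The "user-centric subgraph spanning multiple tables" in the contributions above can be pictured with a small sketch. This is not the paper's implementation: the table layout, the `build_user_subgraph` helper, the `root_table` parameter, and the toy `users`/`orders`/`items` schema are all hypothetical, and the sketch only gathers the rows linked to one user through foreign keys. It stops short of the Heterogeneous GNN encoding step.

```python
from collections import defaultdict, deque

def build_user_subgraph(user_id, tables, foreign_keys, root_table="users"):
    """Collect every row reachable from one user row by following foreign keys.

    tables: {table_name: {row_id: row_dict}}            (hypothetical layout)
    foreign_keys: list of (child_table, fk_column, parent_table)
    Returns (nodes, edges), where each node is a (table_name, row_id) pair.
    """
    # Index child rows by the parent row they reference.
    children = defaultdict(list)
    for child_table, fk_column, parent_table in foreign_keys:
        for row_id, row in tables[child_table].items():
            children[(parent_table, row[fk_column])].append((child_table, row_id))

    # Breadth-first walk from the user row down through the child tables.
    root = (root_table, user_id)
    nodes, edges, seen = [root], [], {root}
    frontier = deque([root])
    while frontier:
        parent = frontier.popleft()
        for child in children.get(parent, []):
            edges.append((parent, child))
            if child not in seen:  # a row may be reachable via several parents
                seen.add(child)
                nodes.append(child)
                frontier.append(child)
    return nodes, edges

# Toy relational database: users -> orders -> items.
tables = {
    "users": {1: {"age": 34}, 2: {"age": 51}},
    "orders": {10: {"user_id": 1}, 11: {"user_id": 1}, 12: {"user_id": 2}},
    "items": {100: {"order_id": 10}, 101: {"order_id": 11}, 102: {"order_id": 12}},
}
foreign_keys = [("orders", "user_id", "users"), ("items", "order_id", "orders")]

nodes, edges = build_user_subgraph(1, tables, foreign_keys)
print(nodes)
```

An item-level audit would look at each of these rows in isolation. The user-level setting treats the whole node set, together with its edges, as the unit whose membership is being inferred.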

🛡️ Threat Analysis

Membership Inference Attack

The paper's primary contribution is a novel membership inference attack (MT-MIA) that determines whether a specific user's data (as a subgraph across multiple tables) was in the training set of a relational synthetic data generator — textbook ML04.
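The key finding is reported in terms of AUC, so a minimal sketch of how a membership inference attack is commonly scored may help. It assumes the attack emits one real-valued score per candidate user, with higher meaning "more likely a training member". The `membership_auc` helper and the toy scores are illustrative only and are not taken from the paper.

```python
def membership_auc(scores, is_member):
    """AUC as the probability that a random member outscores a random
    non-member, with ties counted as half a win."""
    members = [s for s, m in zip(scores, is_member) if m]
    non_members = [s for s, m in zip(scores, is_member) if not m]
    if not members or not non_members:
        raise ValueError("need at least one member and one non-member")

    wins = 0.0
    for m_score in members:
        for n_score in non_members:
            if m_score > n_score:
                wins += 1.0
            elif m_score == n_score:
                wins += 0.5
    return wins / (len(members) * len(non_members))

# An attack that ranks every member above every non-member.
perfect = membership_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])

# An attack whose scores carry no signal (all ties).
uninformative = membership_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])

print(perfect, uninformative)
```

Under this definition, an AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of members from non-members. The pairwise loop is quadratic in the number of candidates, which is fine for illustration but not for large audits.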


Details

Domains
tabular, graph
Model Types
gnn, gan
Threat Tags
black_box, inference_time, targeted
Datasets
real-world multi-table relational datasets (unspecified in excerpt)
Applications
synthetic data generation, relational database privacy auditing, privacy-preserving data sharing