LoMime: Query-Efficient Membership Inference using Model Extraction in Label-Only Settings

Abdullah Caglar Oksuz 1, Anisa Halimi 2, Erman Ayday 1

0 citations · 46 references · arXiv (Cornell University)

Published on arXiv

2602.18934

Membership Inference Attack

OWASP ML Top 10 — ML04

Model Theft

OWASP ML Top 10 — ML05

Key Finding

A query budget equivalent to testing ~1% of training samples suffices to extract a surrogate and achieve membership inference accuracy within ±1% of direct attacks on the target model.

LoMime

Novel technique introduced


Membership inference attacks (MIAs) threaten the privacy of machine learning models by revealing whether a specific data point was used during training. Existing MIAs often rely on impractical assumptions, such as access to public datasets, shadow models, confidence scores, or knowledge of the training data distribution, making them vulnerable to defenses like confidence masking and adversarial regularization. Label-only MIAs, even under strict constraints, suffer from high per-sample query requirements. We propose a cost-effective label-only MIA framework based on transferability and model extraction. By querying the target model M using active sampling, perturbation-based selection, and synthetic data, we extract a functionally similar surrogate S on which membership inference is performed. This shifts the query overhead to a one-time extraction phase, eliminating repeated queries to M. Operating under strict black-box constraints, our method matches the performance of state-of-the-art label-only MIAs while significantly reducing query costs. On benchmarks including Purchase, Location, and Texas Hospital, we show that a query budget equivalent to testing $\approx1\%$ of training samples suffices to extract S and achieve membership inference accuracy within $\pm1\%$ of attacks run directly on M. We also evaluate the effectiveness of standard defenses against label-only MIAs when applied to our attack.


Key Contributions

  • A label-only MIA framework (LoMime) that uses one-time model extraction to build a surrogate, eliminating repeated per-sample queries to the target model
  • Active sampling and perturbation-based selection strategies for efficient surrogate extraction using synthetic data under strict black-box constraints
  • Empirical demonstration that a query budget of ~1% of training samples suffices to achieve MIA accuracy within ±1% of the target, with evaluation of standard label-only defenses against the attack
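This summary does not include the paper's pseudocode, but the core idea of a label-only membership test on an extracted surrogate can be illustrated with a common perturbation-robustness heuristic: points lying deep inside a decision region (often training members) keep their predicted label under noise more consistently than boundary points. The sketch below is a hypothetical stand-in, not LoMime's actual algorithm; the nearest-centroid surrogate, the noise scale, and the threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy surrogate standing in for the extracted model S (illustrative only;
# in the paper, S is extracted by querying the black-box target model M).
centroids = np.array([[0.0, 0.0], [3.0, 3.0]])

def surrogate_predict(x):
    """Hard-label prediction only, mirroring the label-only setting."""
    d = np.linalg.norm(centroids - x, axis=1)
    return int(np.argmin(d))

def perturbation_robustness(x, n_perturb=200, sigma=0.5):
    """Fraction of noisy copies of x that keep the surrogate's original label."""
    base = surrogate_predict(x)
    noise = rng.normal(0.0, sigma, size=(n_perturb, x.shape[0]))
    stable = sum(surrogate_predict(x + n) == base for n in noise)
    return stable / n_perturb

def infer_membership(x, threshold=0.9):
    """Label-only membership guess: highly robust points are flagged as members."""
    return perturbation_robustness(x) >= threshold

# A point deep inside class 0 (member-like) vs. one on the decision boundary.
deep_point = np.array([0.1, 0.1])
boundary_point = np.array([1.5, 1.5])
```

Because every query here goes to the surrogate rather than the target, the per-sample cost against M is zero once extraction is done, which is the source of the query savings the paper reports.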

🛡️ Threat Analysis

Membership Inference Attack

The paper's core contribution is a new MIA framework (LoMime) that determines whether specific data points were used in training, operating under strict label-only black-box constraints and evaluated on Purchase, Location, and Texas Hospital benchmarks.

Model Theft

Model extraction is a foundational, non-trivial component of the proposed attack — the paper develops active sampling and perturbation-based selection techniques to extract a functionally similar surrogate model from the target, which is then used for inference. The extraction methodology is a co-contribution, not a passing reference.
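The extraction step described above can be sketched as an uncertainty-driven query loop: spend a fixed query budget on synthetic points where the current surrogate is least certain, retraining as labels arrive. This is a minimal numpy sketch under stated assumptions — the toy black-box target, the logistic-regression surrogate, and the budget/batch sizes are all hypothetical, not the paper's actual extraction procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical black-box target: returns hard labels only (label-only setting).
def target_query(X):
    return (X.sum(axis=1) > 0).astype(int)

# Minimal logistic-regression surrogate trained from scratch on queried labels.
def train_surrogate(X, y, epochs=500, lr=0.1):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def surrogate_proba(X, w, b):
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

budget, batch = 200, 20
X_pool = rng.normal(size=(2000, 2))        # synthetic candidate pool
queried = np.zeros(len(X_pool), dtype=bool)
labels = np.zeros(len(X_pool), dtype=int)

# Seed the loop with a random batch of queries to the target.
init = rng.choice(len(X_pool), batch, replace=False)
queried[init] = True
labels[init] = target_query(X_pool[init])

# Active sampling: query the points where the surrogate is closest to 0.5.
while queried.sum() < budget:
    w, b = train_surrogate(X_pool[queried], labels[queried])
    unc = np.abs(surrogate_proba(X_pool, w, b) - 0.5)
    unc[queried] = np.inf                  # never re-query a point
    pick = np.argsort(unc)[:batch]
    queried[pick] = True
    labels[pick] = target_query(X_pool[pick])

w, b = train_surrogate(X_pool[queried], labels[queried])
# Fidelity: fraction of pool points where surrogate agrees with the target.
agreement = ((surrogate_proba(X_pool, w, b) > 0.5).astype(int)
             == target_query(X_pool)).mean()
```

Concentrating the budget near the surrogate's current decision boundary is what lets a small, one-time set of target queries yield a high-fidelity copy on which all subsequent membership tests run for free.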


Details

Domains
tabular
Model Types
traditional_ml
Threat Tags
black_box · inference_time · targeted
Datasets
Purchase · Location · Texas Hospital
Applications
tabular ml models · privacy-sensitive ml deployments