Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents
Toan Tran 1, Olivera Kotevska 2, Li Xiong 1
Published on arXiv
2603.19375
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
Achieves up to 0.18 absolute AUC improvement over existing membership inference attacks on user-configured target models and datasets
AutoMIA
Novel technique introduced
Membership inference attacks (MIAs), which enable adversaries to determine whether specific data points were part of a model's training dataset, have emerged as an important framework for understanding, assessing, and quantifying the potential information leakage of machine learning systems. Designing effective MIAs is challenging and usually requires extensive manual exploration of model behaviors to identify potential vulnerabilities. In this paper, we introduce AutoMIA -- a novel framework that leverages large language model (LLM) agents to automate the design and implementation of new MIA signal computations. By employing LLM agents, we can systematically explore a vast space of potential attack strategies and discover novel ones. Our experiments demonstrate that AutoMIA can discover new MIAs tailored to a user-configured target model and dataset, yielding improvements of up to 0.18 in absolute AUC over existing MIAs. This work provides the first demonstration that LLM agents can serve as an effective and scalable paradigm for designing and implementing MIAs with state-of-the-art performance, opening up new avenues for future exploration.
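To make the notion of an MIA "signal computation" and its AUC metric concrete, here is a minimal sketch of the classic loss-based membership signal. This is an illustrative baseline, not the AutoMIA framework or any attack from the paper: an overfit model tends to assign lower loss to training ("member") examples, so negative per-example loss can serve as a membership score. The synthetic loss distributions below are assumptions chosen only to demonstrate the AUC computation.

```python
import random

def mia_auc(member_scores, nonmember_scores):
    """AUC of a membership score: the probability that a randomly
    chosen member outranks a randomly chosen non-member (ties count half)."""
    wins = sum((m > n) + 0.5 * (m == n)
               for m in member_scores for n in nonmember_scores)
    return wins / (len(member_scores) * len(nonmember_scores))

# Synthetic per-example losses standing in for a real target model's
# outputs; members get systematically lower loss than non-members.
random.seed(0)
member_losses = [random.gauss(0.2, 0.1) for _ in range(500)]
nonmember_losses = [random.gauss(0.8, 0.3) for _ in range(500)]

# Membership score = negative loss (higher score => more likely member).
auc = mia_auc([-l for l in member_losses],
              [-l for l in nonmember_losses])
print(f"loss-signal MIA AUC: {auc:.3f}")
```

An automated framework in the spirit of the paper would search over alternative signal computations (e.g. calibrated or reference-model-adjusted scores) rather than fixing the loss signal by hand, scoring each candidate by exactly this kind of AUC.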
Key Contributions
- First LLM-agent-based framework for automated discovery of membership inference attack strategies
- Achieves up to 0.18 absolute AUC improvement over existing MIAs through automated signal computation discovery
- Demonstrates scalable paradigm for exploring vast MIA strategy space without manual attack design
🛡️ Threat Analysis
The core contribution is the automated discovery and implementation of membership inference attacks that determine whether specific data points were part of a model's training data.