Membership Inference Test: Auditing Training Data in Object Classification Models

In this research, we analyze the performance of Membership Inference Tests (MINT), focusing on determining whether given data were utilized during the training phase, specifically in the domain of object recognition. Within the area of object recognition, we propose and develop architectures tailored for MINT models. These architectures aim to optimize performance and efficiency in data utilization, offering a tailored solution to tackle the complexities inherent in the object recognition domain. We conducted experiments involving an object detection model, an embedding extractor, and a MINT module. These experiments were performed in three public databases, totaling over 174K images. The proposed architecture leverages convolutional layers to capture and model the activation patterns present in the data during the training process. Through our analysis, we are able to identify given data used for testing and training, achieving precision rates ranging between 70% and 80%, contingent upon the depth of the detection module layer chosen for input to the MINT module. Additionally, our studies entail an analysis of the factors influencing the MINT Module, delving into the contributing elements behind more transparent training processes.

Key Contributions

Novel CNN-based MINT architectures tailored for the object classification domain, leveraging activation patterns from intermediate detection layers
Comprehensive evaluation across three public datasets totaling 174K+ images under multiple experimental scenarios
Analysis of factors (e.g., detection module layer depth) influencing MINT performance, achieving 70–80% precision distinguishing training from test samples

🛡️ Threat Analysis

Membership Inference Attack

The paper directly proposes and evaluates membership inference architectures (MINT) that determine whether specific images were part of a model's training set — the core ML04 threat. The paper uses activation patterns and embeddings from object detection models to train binary classifiers that distinguish training from test samples.