Membership Inference Test: Auditing Training Data in Object Classification Models
Gonzalo Mancera, Daniel DeAlcala, Aythami Morales, Ruben Tolosana, Julian Fierrez
Published on arXiv
2601.12929
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
Achieves 70–80% precision in distinguishing training from test samples on object classification models, with performance dependent on the depth of the detection layer used as MINT module input.
MINT (Membership Inference Test)
Novel technique introduced
In this research, we analyze the performance of Membership Inference Tests (MINT), focusing on determining whether given data were utilized during the training phase, specifically in the domain of object recognition. Within the area of object recognition, we propose and develop architectures tailored for MINT models. These architectures aim to optimize performance and efficiency in data utilization, offering a tailored solution to tackle the complexities inherent in the object recognition domain. We conducted experiments involving an object detection model, an embedding extractor, and a MINT module. These experiments were performed on three public databases, totaling over 174K images. The proposed architecture leverages convolutional layers to capture and model the activation patterns present in the data during the training process. Through our analysis, we are able to identify given data used for testing and training, achieving precision rates ranging between 70% and 80%, contingent upon the depth of the detection module layer chosen for input to the MINT module. Additionally, our studies entail an analysis of the factors influencing the MINT Module, delving into the contributing elements behind more transparent training processes.
Key Contributions
- Novel CNN-based MINT architectures tailored for the object classification domain, leveraging activation patterns from intermediate detection layers
- Comprehensive evaluation across three public datasets totaling 174K+ images under multiple experimental scenarios
- Analysis of factors (e.g., detection module layer depth) influencing MINT performance, with 70–80% precision in distinguishing training from test samples
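For readers unfamiliar with the metric, the reported precision can be read as: of all samples the MINT module flags as training members, the fraction that really were in the training set. The snippet below is a minimal illustration on made-up labels; the `precision` helper and the ten-sample audit are hypothetical and are not taken from the paper.

```python
def precision(predicted_member, is_member):
    """Fraction of samples flagged as training members that truly are members."""
    pairs = list(zip(predicted_member, is_member))
    true_pos = sum(1 for p, y in pairs if p and y)
    false_pos = sum(1 for p, y in pairs if p and not y)
    flagged = true_pos + false_pos
    # Convention chosen here: report 0.0 when nothing was flagged.
    return true_pos / flagged if flagged else 0.0


# Hypothetical audit of 10 images: 1 = training member, 0 = unseen test image.
is_member        = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted_member = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

score = precision(predicted_member, is_member)
print(f"precision = {score:.2f}")  # 4 correct flags out of 5 flagged -> 0.80
```

Precision alone says nothing about members that were missed (the fourth image above), so it is usually read alongside recall or accuracy.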
Threat Analysis
The paper directly proposes and evaluates membership inference architectures (MINT) that determine whether specific images were part of a model's training set — the core ML04 threat. The paper uses activation patterns and embeddings from object detection models to train binary classifiers that distinguish training from test samples.
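The general idea of training a binary classifier on activation patterns can be sketched as follows. This is a toy illustration under strong assumptions, not the authors' architecture: the paper's MINT modules are CNN-based and consume real activations from an audited model, whereas this sketch swaps in plain logistic regression over synthetic vectors. The function names, the vector dimension, and the `shift` parameter (a deliberately exaggerated gap between member and non-member activations) are all hypothetical.

```python
import math
import random


def make_activations(n, member, rng, dim=8, shift=1.0):
    """Synthetic stand-in for one layer's activations (assumption: members are shifted)."""
    mean = shift if member else 0.0
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Estimated probability that activation vector x comes from a training member."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_mint(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression 'MINT module' with full-batch gradient descent."""
    dim, n = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = score(w, b, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)  # fixed seed so the toy run is repeatable

# Activations with known membership, used to train the auditing classifier.
train_x = make_activations(100, True, rng) + make_activations(100, False, rng)
train_y = [1] * 100 + [0] * 100
w, b = train_mint(train_x, train_y)

# Fresh samples with known membership, used only for evaluation.
true_pos = sum(score(w, b, x) >= 0.5 for x in make_activations(100, True, rng))
false_pos = sum(score(w, b, x) >= 0.5 for x in make_activations(100, False, rng))
mint_precision = true_pos / max(1, true_pos + false_pos)
print(f"toy MINT precision: {mint_precision:.2f}")
```

Because the synthetic gap is much larger than any signal one would expect from a real model, the number printed here says nothing about the paper's results; the sketch only shows the shape of the audit: collect activations with known membership, fit a classifier, then measure how often its "member" flags are correct.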