defense 2026

Generalization and Membership Inference Attack: A Practical Perspective

Fateme Rahmani, Mahdi Jafari Siavoshani, Mohammad Hossein Rohban

0 citations

Published on arXiv

2604.19936

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Advanced generalization techniques can reduce membership inference attack performance by up to 100× while improving model accuracy


With the emergence of new evaluation metrics and attack methodologies for Membership Inference Attacks (MIA), it becomes essential to reevaluate previously accepted assumptions. In this paper, we revisit the longstanding debate regarding the correlation between MIA success rates and model generalization using an empirical approach. We focused on employing augmentation techniques and early stopping to enhance model generalization and examined their impact on MIA success rates. We found that utilizing advanced generalization techniques can significantly decrease attack performance, potentially by up to 100 times. Moreover, combining these methods not only improves model generalization but also reduces attack effectiveness by introducing randomness during training. Additionally, our study confirmed the direct impact of generalization on MIA performance through an analysis of over 1K models in a controlled environment.


Key Contributions

  • Empirical analysis of 1K+ models showing generalization techniques (augmentation, early stopping) reduce MIA success by up to 100×
  • Demonstrates that combining generalization methods introduces training randomness that further reduces attack effectiveness
  • Confirms the direct impact of model generalization on MIA vulnerability in controlled experiments
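Early stopping, one of the generalization techniques named above, can be illustrated with a minimal sketch. The patience-based rule, the function name `early_stopping_epoch`, and the parameters `patience` and `min_delta` are assumptions made for illustration; the summary does not state which stopping criterion the paper uses.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch whose checkpoint would be kept.

    Hypothetical patience-based rule (an assumption, not the paper's setup):
    training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember this checkpoint.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training here.
            break
    return best_epoch


# Validation loss starts rising after epoch 2, so training halts at epoch 5
# and the epoch-2 checkpoint is kept.
losses = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 0.6]
kept_epoch = early_stopping_epoch(losses, patience=3)
```

Stopping before the model fits the training set too closely narrows the gap between its behavior on members and non-members, which is the signal a membership inference attack relies on.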

🛡️ Threat Analysis

Membership Inference Attack

The paper's core focus is membership inference attacks: determining whether specific data points were in the training set. It evaluates MIA performance using the LiRA attack and the TPR@0.1%FPR metric (true-positive rate at a 0.1% false-positive rate), and tests defenses (augmentation, early stopping) against MIA.
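The TPR@0.1%FPR metric can be made concrete with a short sketch: pick the strictest score threshold at which no more than 0.1% of known non-members are flagged, then report the fraction of true members above it. The function name `tpr_at_fpr` and the convention that a higher score means "more likely a member" are assumptions for illustration, not the paper's evaluation code.

```python
def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.001):
    """True-positive rate at a threshold whose false-positive rate <= target_fpr.

    Assumes a higher attack score means "more likely a training member".
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    # How many non-members may be wrongly flagged as members.
    # The small epsilon guards against float rounding just below an integer.
    max_false_positives = int(target_fpr * len(nonmember_scores) + 1e-9)
    ranked = sorted(nonmember_scores, reverse=True)
    if max_false_positives < len(ranked):
        # Flag "member" only when score > threshold, so at most
        # `max_false_positives` non-members can exceed it.
        threshold = ranked[max_false_positives]
    else:
        threshold = float("-inf")
    true_positives = sum(1 for s in member_scores if s > threshold)
    return true_positives / len(member_scores)


# Toy scores; a 10% FPR budget is used because there are only 10 non-members.
members = [0.9, 0.8, 0.4, 0.2]
nonmembers = [0.95, 0.5, 0.3, 0.1, 0.05, 0.04, 0.03, 0.02, 0.01, 0.0]
tpr = tpr_at_fpr(members, nonmembers, target_fpr=0.1)
```

With a 0.1% budget, at least 1,000 non-member scores are needed before even one false positive is allowed, which is why low-FPR evaluation calls for large held-out sets.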


Details

Domains
vision
Model Types
cnn
Threat Tags
black_box, white_box, training_time
Applications
image classification