defense 2025

Privacy-Utility Trade-off in Data Publication: A Bilevel Optimization Framework with Curvature-Guided Perturbation

Yi Yin , Guangquan Zhang , Hua Zuo , Jie Lu


Published on arXiv: 2509.02048

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

The proposed method enhances resistance to MIA in downstream tasks while surpassing existing privacy-preserving methods in sample quality and diversity.

Curvature-Guided Bilevel Optimization

Novel technique introduced


Machine learning models require datasets for effective training, but directly sharing raw data poses significant privacy risks, such as membership inference attacks (MIA). To mitigate these risks, privacy-preserving techniques such as data perturbation, generalization, and synthetic data generation are commonly utilized. However, these methods often degrade data accuracy, specificity, and diversity, limiting the performance of downstream tasks and thus reducing data utility. Therefore, striking an optimal balance between privacy preservation and data utility remains a critical challenge. To address this issue, we introduce a novel bilevel optimization framework for the publication of private datasets, where the upper-level task focuses on data utility and the lower-level task focuses on data privacy. In the upper-level task, a discriminator guides the generation process to ensure that perturbed latent variables are mapped to high-quality samples, maintaining fidelity for downstream tasks. In the lower-level task, our framework employs local extrinsic curvature on the data manifold as a quantitative measure of individual vulnerability to MIA, providing a geometric foundation for targeted privacy protection. By perturbing samples toward low-curvature regions, our method effectively suppresses distinctive feature combinations that are vulnerable to MIA. Through alternating optimization of both objectives, we achieve a synergistic balance between privacy and utility. Extensive experimental evaluations demonstrate that our method not only enhances resistance to MIA in downstream tasks but also surpasses existing methods in terms of sample quality and diversity.
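The alternating upper/lower optimization described in the abstract can be illustrated on a toy problem. Everything below is a hypothetical sketch, not the authors' implementation: a quadratic pull toward a fixed target stands in for the discriminator-guided utility objective, and the analytic curvature of the parabola y = x² stands in for the learned data manifold.

```python
import numpy as np

# Toy sketch of the alternating bilevel scheme. The "manifold" is the
# parabola y = x^2, whose extrinsic curvature kappa(x) = 2/(1+4x^2)^(3/2)
# peaks at the apex (x = 0) and decays along the flanks. The upper step
# pulls the point toward a utility target (a stand-in for the
# discriminator-guided fidelity objective); the lower step descends the
# curvature, pushing the point toward low-curvature (low-MIA-risk) regions.

def upper_step(z, target, lr=0.1):
    # Gradient step on the utility loss ||z - target||^2.
    return z - lr * 2.0 * (z - target)

def lower_step(z, lr=0.1):
    # Gradient descent on kappa(x) = 2 / (1 + 4x^2)^(3/2):
    # d(kappa)/dx = -24x / (1 + 4x^2)^(5/2), so the step increases |x|,
    # moving the sample toward the flatter flanks of the parabola.
    x = z[0]
    grad = -24.0 * x / (1.0 + 4.0 * x**2) ** 2.5
    out = z.copy()
    out[0] -= lr * grad
    return out

def alternate(z0, target, steps=50):
    z = np.asarray(z0, dtype=float)
    for _ in range(steps):
        z = upper_step(z, target)  # utility objective (upper level)
        z = lower_step(z)          # privacy objective (lower level)
    return z

z = alternate([0.1, 0.0], target=np.array([1.0, 0.0]))
# z settles near the utility target, nudged slightly further out along
# the low-curvature flank by the privacy term.
```

The design point the toy illustrates is the alternation itself: neither objective is optimized to completion, so the fixed point is a compromise between fidelity (staying near the target) and privacy (sitting in a low-curvature region).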


Key Contributions

  • Bilevel optimization framework where the upper level preserves data utility via a discriminator-guided generation process and the lower level protects privacy by minimizing MIA vulnerability
  • Novel use of local extrinsic curvature on the data manifold as a geometric measure of individual sample vulnerability to membership inference attacks
  • Curvature-guided perturbation that shifts samples toward low-curvature manifold regions, suppressing distinctive feature combinations that make samples identifiable to MIA

🛡️ Threat Analysis

Membership Inference Attack

The paper's lower-level optimization objective is explicitly designed to defend against membership inference attacks (MIA). It uses local extrinsic curvature to quantify individual sample vulnerability to MIA and perturbs high-curvature (high-risk) samples toward low-curvature regions, directly suppressing distinctive feature combinations that enable MIA. The paper evaluates resistance to MIA as the primary privacy metric.
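One plausible, purely illustrative way to score per-sample curvature is a local-PCA residual: fit a tangent line/plane to each sample's nearest neighbors and treat the out-of-plane residual (relative to the in-plane spread) as the vulnerability score. The paper's exact curvature estimator may differ; this sketch only demonstrates the "high curvature ⇒ distinctive, MIA-prone sample" intuition on synthetic data, where the apex of a parabola is correctly flagged as the highest-curvature region.

```python
import numpy as np

def curvature_proxy(X, k=10):
    # For each sample, take its k nearest neighbors, fit a local
    # tangent direction via SVD, and score the sample by the ratio of
    # the smallest to the largest singular value of the centered
    # neighborhood: the out-of-plane residual relative to the in-plane
    # spread. A higher score means a more sharply curved neighborhood,
    # i.e. (per the paper's premise) a more MIA-vulnerable sample.
    n = X.shape[0]
    scores = np.zeros(n)
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        nbrs = X[np.argsort(dists)[1:k + 1]]  # exclude the sample itself
        centered = nbrs - nbrs.mean(axis=0)
        s = np.linalg.svd(centered, compute_uv=False)
        scores[i] = s[-1] / (s[0] + 1e-12)
    return scores

rng = np.random.default_rng(0)
t = rng.uniform(-2, 2, 200)
# Points on a noisy parabola: extrinsic curvature is highest near the
# apex (t ~ 0) and lowest on the flanks (|t| large).
X = np.c_[t, t**2] + 0.01 * rng.standard_normal((200, 2))
scores = curvature_proxy(X)
# Apex samples score higher than flank samples, so a curvature-guided
# defense would perturb them the most.
```

In the paper's scheme, samples with high scores would then be perturbed toward low-curvature regions (the flanks, in this toy), which is what the lower-level objective operationalizes.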


Details

Domains
vision, generative
Model Types
gan, traditional_ml
Threat Tags
training_time, black_box
Datasets
ADNI
Applications
private dataset publication, privacy-preserving data sharing, image data de-identification