Source Models Leak What They Shouldn't: Unlearning Zero-Shot Transfer in Domain Adaptation Through Adversarial Optimization
Arnav Devalapally 1,2, Poornima Jain 1, Kartik Srinivas 1,3, Vineeth N. Balasubramanian 1,4
Published on arXiv
2604.08238
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
Achieves retraining-level unlearning performance while preventing zero-shot transfer of source-exclusive classes during domain adaptation
SCADA-UL
Novel problem setting introduced
The increasing adaptation of vision models across domains, such as satellite imagery and medical scans, raises an emerging privacy risk: models may inadvertently retain and leak sensitive source-domain-specific information in the target domain. This creates a compelling use case for machine unlearning (MU) to protect the privacy of sensitive source-domain data. Among adaptation techniques, source-free domain adaptation (SFDA) makes the need for MU especially urgent: the source data itself is protected, yet the source model exposed during adaptation still encodes its influence. Our experiments reveal that existing SFDA methods exhibit strong zero-shot performance on source-exclusive classes in the target domain, indicating that they inadvertently leak knowledge of these classes into the target domain even when those classes are not represented in the target data. We identify and address this risk by proposing an MU setting called SCADA-UL: Unlearning Source-exclusive ClAsses in Domain Adaptation. Existing MU methods do not address this setting, as they are not designed to handle data distribution shifts. We propose a new unlearning method in which adversarially generated forget-class samples are unlearned by the model during the domain adaptation process using a novel rescaled labeling strategy and adversarial optimization. We also extend our study to two variants: a continual version of this problem setting, and one where the specific source classes to be forgotten may be unknown. Alongside theoretical interpretations, our comprehensive empirical results show that our method consistently outperforms baselines in the proposed setting while achieving retraining-level unlearning performance on benchmark datasets. Our code is available at https://github.com/D-Arnav/SCADA
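The abstract names the mechanism only at a high level. As a rough illustration (not the authors' published algorithm), the sketch below shows one plausible instantiation in PyTorch: a PGD-style step synthesizes forget-class inputs from target-domain data, and the model is then pushed toward a rescaled label distribution with the forget-class probability mass removed. All function names, hyperparameters, and the KL-based loss are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def pgd_towards_class(model, x, forget_class, steps=10, eps=8/255, alpha=2/255):
    """Perturb target-domain inputs so the model predicts the forget class:
    one plausible way to synthesize forget-class samples when the source
    data itself is inaccessible (SFDA). Hyperparameters are illustrative."""
    x = x.detach()
    y = torch.full((x.size(0),), forget_class, device=x.device)
    x_adv = x.clone().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        # Descend on the loss so predictions move *toward* the forget class.
        x_adv = (x_adv - alpha * grad.sign()).detach()
        # Project back into the eps-ball around the clean input.
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).requires_grad_(True)
    return x_adv.detach()

def rescaled_label(logits, forget_class):
    """Zero out the forget class and renormalize the remaining probability
    mass, giving a soft target that carries no forget-class information."""
    p = F.softmax(logits, dim=1)
    p[:, forget_class] = 0.0
    return p / p.sum(dim=1, keepdim=True).clamp_min(1e-8)

def unlearning_step(model, optimizer, x_tgt, forget_class):
    """One unlearning update on adversarially generated forget-class samples.
    In the paper this runs alongside the SFDA adaptation objective; it is
    shown in isolation here."""
    x_adv = pgd_towards_class(model, x_tgt, forget_class)
    logits = model(x_adv)
    with torch.no_grad():
        target = rescaled_label(logits, forget_class)
    loss = F.kl_div(F.log_softmax(logits, dim=1), target, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Generating forget-class samples adversarially is the natural move in this setting: since SFDA forbids access to source data, the only way to obtain inputs that exercise the forget classes is to synthesize them from target-domain data.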
Key Contributions
- Identifies privacy risk in source-free domain adaptation where models leak source-exclusive class knowledge via zero-shot transfer
- Proposes SCADA-UL, a new machine unlearning setting for domain adaptation with distribution shift
- Adversarial optimization method with rescaled labeling strategy that achieves retraining-level unlearning performance
🛡️ Threat Analysis
The paper addresses a privacy leakage problem in which source models inadvertently retain and leak sensitive source-domain-specific information (class knowledge) into the target domain during domain adaptation. The threat model involves an adversary exploiting zero-shot transfer to infer knowledge about source-exclusive classes. The proposed unlearning method is explicitly evaluated against this data leakage threat.
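As a concrete check for this leak (an illustration consistent with the evaluation described above, not the paper's exact protocol), one can score the adapted model on target-domain examples of the source-exclusive classes. The `forget_loader` below is a hypothetical DataLoader yielding such examples with their original source labels.

```python
import torch

@torch.no_grad()
def zero_shot_leakage(model, forget_loader, device="cuda"):
    """Accuracy of an adapted model on source-exclusive (forget) classes it
    never saw in the target data. Accuracy near chance suggests successful
    unlearning; accuracy near the source model's level indicates zero-shot
    transfer of forget-class knowledge."""
    model.eval()
    correct, total = 0, 0
    for x, y in forget_loader:
        x, y = x.to(device), y.to(device)
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / max(total, 1)
```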