benchmark 2026

SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning

Cai Selvas-Sala 1,2,3, Lei Kang 1,3, Lluis Gomez 1,3


Published on arXiv: 2603.26316

Threat Category: Membership Inference Attack (OWASP ML Top 10 — ML04)

Key Finding: Current unlearning methods either fail to forget effectively or over-generalize, erasing more than the intended associations.

Novel Technique Introduced: SALMUBench


As multimodal models like CLIP become integral to downstream systems, the need to remove sensitive information is critical. However, machine unlearning for contrastively-trained encoders remains underexplored, and existing evaluations fail to diagnose fine-grained, association-level forgetting. We introduce SALMUBench (Sensitive Association-Level Multimodal Unlearning), a benchmark built upon a synthetic dataset of 60K persona-attribute associations and two foundational models: a Compromised model polluted with this data, and a Clean model without it. To isolate unlearning effects, both are trained from scratch on the same 400M-pair retain base, with the Compromised model additionally trained on the sensitive set. We propose a novel evaluation protocol with structured holdout sets (holdout identity, holdout association) to precisely measure unlearning efficacy and collateral damage. Our benchmark reveals that while utility-efficient deletion is feasible, current methods exhibit distinct failure modes: they either fail to forget effectively or over-generalize by erasing more than intended. SALMUBench sets a new standard for comprehensive unlearning evaluation, and we publicly release our dataset, models, evaluation scripts, and leaderboards to foster future research.
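The evaluation idea described above — scoring associations on a Compromised/unlearned model against the Clean reference, on both the forget set and structured holdouts — can be sketched as follows. This is a minimal illustration, not the benchmark's released scripts: the function names and the report fields are hypothetical, and embeddings are assumed to be precomputed by the respective encoders.

```python
# Hypothetical sketch of association-level unlearning evaluation.
# An association (persona image, attribute text) is scored by the cosine
# similarity of its embeddings. Forgetting efficacy: the unlearned model's
# forget-set score should approach the Clean model's (gap ~ 0). Collateral
# damage: holdout associations that should be preserved must not lose score.
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_score(pairs):
    """Mean image-text cosine similarity over (img_emb, txt_emb) pairs."""
    return float(np.mean([cosine(i, t) for i, t in pairs]))

def unlearning_report(forget_unl, forget_clean, holdout_unl, holdout_clean):
    """Each argument is a list of (img_emb, txt_emb) pairs embedded by the
    unlearned model (*_unl) or the Clean reference model (*_clean)."""
    return {
        # ~0 means the forget set looks as unfamiliar as to the Clean model;
        # a large positive value means the association was not forgotten.
        "forget_gap": mean_score(forget_unl) - mean_score(forget_clean),
        # >0 means holdout associations were erased too (over-generalization).
        "collateral_damage": mean_score(holdout_clean) - mean_score(holdout_unl),
    }
```

Under this framing, the paper's two failure modes map directly onto the two numbers: "fails to forget" shows up as a large `forget_gap`, while "over-generalizes" shows up as positive `collateral_damage` on the holdout-identity and holdout-association sets.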


Key Contributions

  • First benchmark for association-level unlearning in contrastively-trained multimodal models
  • Structured evaluation protocol with holdout sets to isolate forgetting efficacy from collateral damage
  • Synthetic dataset of 60K persona-attribute associations with Compromised and Clean baseline models

🛡️ Threat Analysis

Membership Inference Attack

While the paper frames unlearning as removing sensitive persona-attribute associations, its evaluation methodology effectively measures whether unlearned associations can still be inferred from the model. This is analogous to a membership inference attack at the association level: the benchmark tests whether specific training associations remain encoded in the model, and thereby quantifies the privacy risk of association retention.
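The membership-style inference described above can be sketched as a simple threshold probe. This is an illustrative toy, not the paper's protocol: the function names are hypothetical, and the attacker is assumed to observe the model's similarity score for each candidate association and to hold a set of known non-member associations for calibration.

```python
# Toy association-level membership inference probe (illustrative only).
# The attacker calibrates a score threshold on associations known NOT to be
# in training, then flags any candidate association whose score exceeds it
# as likely present (i.e. retained/not unlearned) in the model.
import numpy as np

def calibrate_threshold(nonmember_scores, fpr=0.05):
    """Pick the (1 - fpr) quantile of non-member scores as the decision
    threshold, bounding the false-positive rate on the calibration set."""
    return float(np.quantile(np.asarray(nonmember_scores, float), 1.0 - fpr))

def infer_membership(scores, threshold):
    """Predict True (association present in training) where the model's
    similarity score for the association exceeds the threshold."""
    return [s > threshold for s in scores]
```

An unlearning method that truly removes an association should push its score below such a threshold; a method that only hides it superficially will still be flagged.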


Details

Domains
multimodal, vision, nlp
Model Types
multimodal, transformer
Threat Tags
training_time
Datasets
SALMUBench (60K synthetic persona-attribute associations); 400M-pair retain set
Applications
multimodal encoders; vision-language models