Alice Dethise

h-index: 2 · 4 citations · 4 papers (total)

Papers in Database (1)

defense arXiv Sep 29, 2025 · Sep 2025

Defeating Cerberus: Concept-Guided Privacy-Leakage Mitigation in Multimodal Language Models

Boyang Zhang, Istemi Ekin Akkus, Ruichuan Chen et al. · CISPA Helmholtz Center for Information Security · Nokia Bell Labs

Concept-guided weight editing stops VLMs from leaking or processing PII, reaching a 93.3% refusal rate without retraining

Sensitive Information Disclosure · vision · nlp · multimodal