Are Neuro-Inspired Multi-Modal Vision-Language Models Resilient to Membership Inference Privacy Leakage?
David Amebley 1, Sayanton Dibbo 1,2
Published on arXiv
2511.20710
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
Topological regularization (τ > 0) reduces MIA success by ~24% in mean ROC-AUC on BLIP/COCO while maintaining comparable caption quality under the MPNet and ROUGE-2 metrics.
τ-regularization (Neuroscience-Inspired Topological Regularization)
Novel technique introduced
In the age of agentic AI, the growing deployment of multi-modal models (MMs) has introduced new attack vectors that can leak sensitive training data, causing privacy leakage. This paper investigates a black-box privacy attack, the membership inference attack (MIA), on multi-modal vision-language models (VLMs). State-of-the-art research analyzes privacy attacks primarily on unimodal AI-ML systems, while recent studies indicate MMs can also be vulnerable to privacy attacks. Although researchers have demonstrated that biologically inspired neural network representations can improve unimodal model resilience against adversarial attacks, it remains unexplored whether neuro-inspired MMs are resilient against privacy attacks. In this work, we introduce a systematic neuroscience-inspired topological regularization (τ) framework to analyze the resilience of MM VLMs against image-text-based inference privacy attacks. We examine this phenomenon using three VLMs (BLIP, PaliGemma 2, and ViT-GPT2) across three benchmark datasets (COCO, CC3M, and NoCaps). Our experiments compare the resilience of baseline and neuro VLMs (with topological regularization), where the τ > 0 configuration defines the NEURO variant of a VLM. Our results on the BLIP model with the COCO dataset show that MIA success in NEURO VLMs drops by 24% in mean ROC-AUC while achieving similar model utility (similarity between generated and reference captions) in terms of the MPNet and ROUGE-2 metrics. This shows that neuro VLMs are comparatively more resilient against privacy attacks without significantly compromising model utility. Our extensive evaluation with the PaliGemma 2 and ViT-GPT2 models on two additional datasets, CC3M and NoCaps, further validates the consistency of the findings. This work contributes to the growing understanding of privacy risks in MMs and provides evidence of the privacy-threat resilience of neuro VLMs.
Key Contributions
- Introduces neuroscience-inspired topological regularization (τ-regularization) as a defense mechanism against MIA in multi-modal VLMs
- First systematic study of MIA resilience in multi-modal VLMs (BLIP, PaliGemma 2, ViT-GPT2) across three benchmark datasets
- Demonstrates ~24% reduction in MIA attack success (mean ROC-AUC) with minimal degradation in model utility (MPNet and ROUGE-2 metrics)
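The defense works by adding a τ-weighted regularization term to the training objective, where τ = 0 recovers the baseline VLM and τ > 0 defines the NEURO variant. The paper's exact topological term is not reproduced here; the sketch below uses a hypothetical penalty (variance of pairwise distances in the representation space) purely to illustrate how a τ-weighted term combines with the captioning loss.

```python
import numpy as np

def topological_penalty(embeddings: np.ndarray) -> float:
    """Hypothetical stand-in for the paper's topological term: the
    variance of pairwise Euclidean distances between representations.
    This is an illustrative assumption, not the authors' formulation."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    iu = np.triu_indices(len(embeddings), k=1)  # unique pairs only
    return float(dists[iu].var())

def regularized_loss(caption_loss: float, embeddings: np.ndarray, tau: float) -> float:
    """tau = 0 gives the baseline objective; tau > 0 is the NEURO variant."""
    return caption_loss + tau * topological_penalty(embeddings)
```

With τ = 0 the penalty drops out entirely, which is what makes the baseline-vs-NEURO comparison in the paper a controlled one: only the regularization strength changes.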
🛡️ Threat Analysis
The paper's primary focus is membership inference attacks (MIAs) on VLMs: the binary question of whether a specific image-text pair was in the training data. It proposes τ-regularization as a defense and evaluates it directly against MIA success, measured by ROC-AUC.
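The ROC-AUC metric used throughout the paper can be read as the probability that the attacker's membership score ranks a random training member above a random non-member, so 0.5 means the attack does no better than chance. A minimal sketch (the score lists are hypothetical attacker outputs, not the paper's data):

```python
def mia_roc_auc(member_scores, nonmember_scores):
    """ROC-AUC of a membership classifier via its rank-statistic form:
    the fraction of (member, non-member) pairs where the member gets
    the higher score, counting ties as half a win."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))

# Hypothetical attacker confidence scores for illustration:
auc = mia_roc_auc([0.9, 0.8, 0.7], [0.2, 0.4, 0.75])
```

Under this reading, the paper's ~24% drop in mean ROC-AUC for NEURO VLMs moves the attacker closer to the chance level of 0.5, which is the desired direction for a privacy defense.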