
DualTAP: A Dual-Task Adversarial Protector for Mobile MLLM Agents

Fuyao Zhang 1, Jiaming Zhang 1, Che Wang 1,2, Xiongtao Sun 1,3, Yurong Hao 1, Guowei Guan 1, Wenjie Li 4, Longtao Huang 5, Wei Yang Bryan Lim


Published on arXiv: 2511.13248

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Reduces average PII leakage rate by 31.6 percentage points (3.0x relative improvement) across six MLLMs while maintaining 80.8% task success rate versus an 83.6% unprotected baseline.

DualTAP

Novel technique introduced


The reliance of mobile GUI agents on Multimodal Large Language Models (MLLMs) introduces a severe privacy vulnerability: screenshots containing Personally Identifiable Information (PII) are often sent to untrusted, third-party routers. These routers can exploit their own MLLMs to mine this data, violating user privacy. Existing privacy perturbations fail the critical dual challenge of this scenario: protecting PII from the router's MLLM while simultaneously preserving task utility for the agent's MLLM. To address this gap, we propose the Dual-Task Adversarial Protector (DualTAP), a novel framework that, for the first time, explicitly decouples these conflicting objectives. DualTAP trains a lightweight generator using two key innovations: (i) a contrastive attention module that precisely identifies and targets only the PII-sensitive regions, and (ii) a dual-task adversarial objective that simultaneously minimizes a task-preservation loss (to maintain agent utility) and a privacy-interference loss (to suppress PII leakage). To facilitate this study, we introduce PrivScreen, a new dataset of annotated mobile screenshots designed specifically for this dual-task evaluation. Comprehensive experiments on six diverse MLLMs (e.g., GPT-5) demonstrate DualTAP's state-of-the-art protection. It reduces the average privacy leakage rate by 31.6 percentage points (a 3.0x relative improvement) while, critically, maintaining an 80.8% task success rate, a negligible drop from the 83.6% unprotected baseline. DualTAP presents the first viable solution to the privacy-utility trade-off in mobile MLLM agents.
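The dual-task adversarial objective described in the abstract can be read as a single scalar loss that rewards the generator for leaving the agent's behavior unchanged while degrading the PII extractor. The sketch below is an illustrative reading only, not the paper's implementation; the function name `dual_task_loss`, the weight `lam`, and the MSE/cross-entropy choices are all assumptions.

```python
import numpy as np

def dual_task_loss(agent_logits_clean, agent_logits_pert,
                   pii_logits_pert, pii_labels, lam=1.0):
    """Toy combined objective for a protective perturbation generator.

    task-preservation term: keep the agent model's logits on the
    perturbed screenshot close to those on the clean one (MSE here).

    privacy-interference term: *increase* the PII model's cross-entropy
    on the true PII labels, so we subtract it -- the loss drops as the
    PII model's confidence in the correct PII collapses.
    """
    task_loss = np.mean((agent_logits_pert - agent_logits_clean) ** 2)

    # numerically stable log-softmax of the PII model's logits
    z = pii_logits_pert - pii_logits_pert.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    pii_ce = -np.mean(log_probs[np.arange(len(pii_labels)), pii_labels])

    # minimize task drift, maximize PII cross-entropy
    return task_loss - lam * pii_ce
```

A perturbation that leaves the agent's logits untouched while misleading the PII extractor scores strictly lower (better) than one that disrupts the agent, which is the trade-off the dual-task formulation is designed to resolve.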


Key Contributions

  • DualTAP framework that decouples conflicting privacy-protection and task-preservation objectives via a dual-task adversarial training objective
  • Contrastive attention module that precisely targets PII-sensitive regions in screenshots for perturbation, minimizing impact on task-relevant content
  • PrivScreen dataset: annotated mobile screenshots enabling dual-task evaluation of privacy-utility trade-offs in MLLM agent scenarios
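The contrastive attention idea in the first two contributions can be pictured as masking the perturbation to regions where a privacy-probing model attends more strongly than the task model. This is a minimal sketch of that intuition under assumed inputs (two same-shape attention maps); the function name, `keep_frac` parameter, and top-k thresholding are illustrative, not the paper's formulation.

```python
import numpy as np

def pii_region_mask(attn_privacy, attn_task, keep_frac=0.1):
    """Keep only the regions where the PII-extraction model attends
    more than the task model: score each cell by the attention
    contrast, then retain the top `keep_frac` fraction as a 0/1 mask
    that gates where the perturbation is applied."""
    contrast = attn_privacy - attn_task
    k = max(1, int(keep_frac * contrast.size))
    thresh = np.sort(contrast, axis=None)[-k]  # k-th largest contrast
    return (contrast >= thresh).astype(np.float32)
```

Gating the perturbation this way is what lets the protector leave task-relevant pixels largely untouched, which is how a high task success rate can coexist with strong PII suppression.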

🛡️ Threat Analysis

Input Manipulation Attack

DualTAP's core contribution is a trained generator that adds adversarial perturbations to visual inputs (screenshots) to degrade a VLM's ability to extract PII at inference time — a classic adversarial input defense. Per the spec, adversarial perturbations that prevent VLM inference from user-provided visual inputs are ML01, not LLM06.
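At inference time, this class of defense amounts to adding a small, bounded perturbation to the screenshot before it leaves the device. A minimal sketch, assuming an L-infinity budget and pixel values normalized to [0, 1]; the `eps` value and function names are illustrative, not taken from the paper.

```python
import numpy as np

def apply_protection(screenshot, perturbation, eps=8 / 255):
    """Clip a generator-produced perturbation to an L-infinity ball of
    radius `eps`, add it to the screenshot, and clamp the result back
    to the valid [0, 1] pixel range."""
    delta = np.clip(perturbation, -eps, eps)
    return np.clip(screenshot + delta, 0.0, 1.0)
```

The budget keeps the edit visually negligible to the user while still being enough, once shaped by the trained generator, to disrupt a VLM's PII extraction.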


Details

Domains
vision, multimodal
Model Types
vlm, multimodal
Threat Tags
white_box, inference_time, digital, targeted
Datasets
PrivScreen
Applications
mobile gui agents, mllm-based screenshot processing, visual pii protection