α

Published on arXiv

2511.12575

Input Manipulation Attack

OWASP ML Top 10 — ML01

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Two-stage typographic attack significantly reduces geolocation prediction accuracy across five state-of-the-art commercial LVLMs (GPT, Claude, Gemini, Qwen, o3) on three datasets while leaving the original image visually unmodified.

Semantic-aware Typographic Attack

Novel technique introduced


Large Visual Language Models (LVLMs) now pose a serious yet overlooked privacy threat, as they can infer a social media user's geolocation directly from shared images, leading to unintended privacy leakage. While adversarial image perturbations provide a potential direction for geo-privacy protection, they require relatively strong distortions to be effective against LVLMs, which noticeably degrade visual quality and diminish an image's value for sharing. To overcome this limitation, we identify typographical attacks as a promising direction for protecting geo-privacy by adding text extension outside the visual content. We further investigate which textual semantics are effective in disrupting geolocation inference and design a two-stage, semantics-aware typographical attack that generates deceptive text to protect user privacy. Extensive experiments across three datasets demonstrate that our approach significantly reduces geolocation prediction accuracy of five state-of-the-art commercial LVLMs, establishing a practical and visually-preserving protection strategy against emerging geo-privacy threats.


Key Contributions

  • Identifies typographic attacks as a visually non-destructive alternative to pixel-level adversarial perturbations for protecting geo-privacy against LVLMs, preserving image sharing value.
  • Systematically investigates which textual semantics (location plausibility, instructional format, explanatory reinforcement) most effectively disrupt LVLM geolocation inference.
  • Designs a two-stage, feedback-guided framework generating a deceptive instructional sentence followed by model-feedback-conditioned explanatory text to increase attack credibility and stability.

🛡️ Threat Analysis

Input Manipulation Attack

Proposes adversarial typographic inputs to VLMs at inference time — strategically crafted text added to image borders causes commercial LVLMs to produce incorrect geolocation outputs, constituting adversarial visual input manipulation against VLM systems.


Details

Domains
visionmultimodal
Model Types
vlmmultimodal
Threat Tags
black_boxinference_timetargeted
Applications
geolocation inferencesocial media image privacyvisual question answering