defense 2025

Beyond Pixels: Semantic-aware Typographic Attack for Geo-Privacy Protection

0 citations · arXiv

Published on arXiv

2511.12575

Input Manipulation Attack

OWASP ML Top 10 — ML01

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Two-stage typographic attack significantly reduces geolocation prediction accuracy across five state-of-the-art commercial LVLMs (GPT, Claude, Gemini, Qwen, o3) on three datasets while leaving the original image visually unmodified.

Semantic-aware Typographic Attack

Novel technique introduced

Large Visual Language Models (LVLMs) now pose a serious yet overlooked privacy threat, as they can infer a social media user's geolocation directly from shared images, leading to unintended privacy leakage. While adversarial image perturbations provide a potential direction for geo-privacy protection, they require relatively strong distortions to be effective against LVLMs, which noticeably degrade visual quality and diminish an image's value for sharing. To overcome this limitation, we identify typographical attacks as a promising direction for protecting geo-privacy by adding text extension outside the visual content. We further investigate which textual semantics are effective in disrupting geolocation inference and design a two-stage, semantics-aware typographical attack that generates deceptive text to protect user privacy. Extensive experiments across three datasets demonstrate that our approach significantly reduces geolocation prediction accuracy of five state-of-the-art commercial LVLMs, establishing a practical and visually-preserving protection strategy against emerging geo-privacy threats.

Key Contributions

Identifies typographic attacks as a visually non-destructive alternative to pixel-level adversarial perturbations for protecting geo-privacy against LVLMs, preserving image sharing value.
Systematically investigates which textual semantics (location plausibility, instructional format, explanatory reinforcement) most effectively disrupt LVLM geolocation inference.
Designs a two-stage, feedback-guided framework generating a deceptive instructional sentence followed by model-feedback-conditioned explanatory text to increase attack credibility and stability.

🛡️ Threat Analysis

Input Manipulation Attack

Proposes adversarial typographic inputs to VLMs at inference time — strategically crafted text added to image borders causes commercial LVLMs to produce incorrect geolocation outputs, constituting adversarial visual input manipulation against VLM systems.

Details

Domains

visionmultimodal

Model Types

vlmmultimodal

Threat Tags

black_boxinference_timetargeted

Applications

geolocation inferencesocial media image privacyvisual question answering

Read PDF arXiv DOI

Beyond Pixels: Semantic-aware Typographic Attack for Geo-Privacy Protection

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

Towards Mechanistic Defenses Against Typographic Attacks in CLIP

ADVEDM:Fine-grained Adversarial Attack against VLM-based Embodied Agents

GeoShield: Safeguarding Geolocation Privacy from Vision-Language Models via Adversarial Perturbations

Principled Steering via Null-space Projection for Jailbreak Defense in Vision-Language Models

ORCA: Agentic Reasoning For Hallucination and Adversarial Robustness in Vision-Language Models

GuardAlign: Test-time Safety Alignment in Multimodal Large Language Models

Extended to Reality: Prompt Injection in 3D Environments

Reimagining Safety Alignment with An Image