Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models
Jiaming Zhang 1, Che Wang 1,2, Yang Cao 3, Longtao Huang 4, Wei Yang Bryan Lim 1
Published on arXiv
2512.08503
Input Manipulation Attack
OWASP ML Top 10 — ML01
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
ReasonBreak achieves 33.8% tract-level protection vs. 19.4% for the strongest baseline, and nearly doubles block-level protection (33.5% vs. 16.8%), across seven leading MLRMs including GPT-o3 and Gemini 2.5 Pro.
ReasonBreak
Novel technique introduced
Multi-modal large reasoning models (MLRMs) pose significant privacy risks by inferring precise geographic locations from personal images through hierarchical chain-of-thought reasoning. Existing privacy protection techniques, designed primarily for perception-based models, prove ineffective against MLRMs' sophisticated multi-step reasoning processes that analyze environmental cues. We introduce **ReasonBreak**, a novel adversarial framework specifically designed to disrupt hierarchical reasoning in MLRMs through concept-aware perturbations. Our approach is founded on the key insight that effective disruption of geographic reasoning requires perturbations aligned with conceptual hierarchies rather than uniform noise. ReasonBreak strategically targets critical conceptual dependencies within reasoning chains, generating perturbations that invalidate specific inference steps and cascade through subsequent reasoning stages. To facilitate this approach, we contribute **GeoPrivacy-6K**, a comprehensive dataset comprising 6,341 ultra-high-resolution images (≥2K) with hierarchical concept annotations. Extensive evaluation across seven state-of-the-art MLRMs (including GPT-o3, GPT-5, Gemini 2.5 Pro) demonstrates ReasonBreak's superior effectiveness, achieving a 14.4-percentage-point improvement in tract-level protection (33.8% vs. 19.4%) and nearly doubling block-level protection (33.5% vs. 16.8%). This work establishes a new paradigm for privacy protection against reasoning-based threats.
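The abstract describes GeoPrivacy-6K as pairing each image with hierarchical concept annotations; the contributions below add that these span three levels and carry spatially-localized bounding boxes. The paper does not publish a schema, so the record layout below is an illustrative assumption of what such an annotation might look like; all class and field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptAnnotation:
    """One spatially-localized geographic cue (hypothetical schema),
    e.g. a road sign, architectural style, or vegetation patch."""
    label: str
    level: int    # assumed: 1 = coarse (country/region) ... 3 = fine (block-level cue)
    bbox: tuple   # (x_min, y_min, x_max, y_max) in pixels

@dataclass
class GeoPrivacyRecord:
    """One image in a GeoPrivacy-6K-style dataset (illustrative, not the paper's format)."""
    image_path: str
    resolution: tuple                 # (width, height); paper states >= 2K resolution
    concepts: list = field(default_factory=list)

    def cues_at_level(self, level):
        """Return the concept cues annotated at one hierarchy level."""
        return [c for c in self.concepts if c.level == level]
```

A reasoning-aware defense could then look up, say, all level-3 cues of an image to decide where fine-grained location evidence is concentrated.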
Key Contributions
- ReasonBreak: a concept-aware adversarial perturbation framework that targets critical visual concepts within MLRM chain-of-thought reasoning chains, causing cascading disruption of hierarchical geographic inference
- GeoPrivacy-6K: a dataset of 6,341 ultra-high-resolution images (≥2K) with three-level hierarchical geographic concept annotations and spatially-localized bounding boxes for reasoning-aware defense research
- Empirical validation across seven state-of-the-art MLRMs (GPT-o3, GPT-5, Gemini 2.5 Pro, etc.) achieving 33.8% tract-level and 33.5% block-level protection, substantially outperforming prior baselines
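The core mechanism above — perturbations aligned with conceptual hierarchies rather than uniform noise — can be sketched as a sign-gradient (PGD-style) update whose budget is restricted to the annotated concept regions. This is a minimal illustration, not the paper's algorithm: the function names are hypothetical, and `grad_fn` stands in for the gradient of whatever reasoning-disruption loss ReasonBreak actually optimizes.

```python
import numpy as np

def concept_masked_perturbation(image, concept_mask, grad_fn,
                                epsilon=8 / 255, alpha=2 / 255, steps=10):
    """Illustrative concept-aware perturbation (hypothetical sketch).

    image        : float array in [0, 1]
    concept_mask : same shape, 1 inside annotated concept boxes, 0 elsewhere
    grad_fn      : callable returning the loss gradient w.r.t. the input
    """
    delta = np.zeros_like(image)
    for _ in range(steps):
        g = grad_fn(image + delta)
        # Spend the perturbation budget only on concept regions the
        # hierarchical reasoning chain depends on.
        delta += alpha * np.sign(g) * concept_mask
        # Project back into the L_inf budget and the valid pixel range.
        delta = np.clip(delta, -epsilon, epsilon)
        delta = np.clip(image + delta, 0.0, 1.0) - image
    return image + delta
```

Because the update is multiplied by the mask, pixels outside concept boxes are left untouched, which matches the paper's claim that targeted (rather than uniform) perturbation is what breaks the inference chain.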
🛡️ Threat Analysis
ReasonBreak generates adversarial perturbations that are applied to images so that, at inference time, a VLM's hierarchical geographic chain-of-thought is disrupted. Although deployed defensively, this is adversarial image perturbation against VLM inference, and it is explicitly categorized as ML01 under the adversarial-visual-inputs-to-VLMs rule.