attack 2026

Unsafe by Reciprocity: How Generation-Understanding Coupling Undermines Safety in Unified Multimodal Models

Kaishen Wang, Heng Huang

0 citations

Published on arXiv

2603.27332

Input Manipulation Attack

OWASP ML Top 10 — ML01

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Achieves high Attack Success Rates in both G→U and U→G pathways, revealing that unsafe intermediate signals propagate across modalities in tightly coupled UMMs

RICE

Novel technique introduced


Recent advances in Large Language Models (LLMs) and Text-to-Image (T2I) models have led to the emergence of Unified Multimodal Models (UMMs), where multimodal understanding and image generation are tightly integrated within a shared architecture. Prior studies suggest that such reciprocity enhances cross-functionality performance through shared representations and joint optimization. However, the safety implications of this tight coupling remain largely unexplored, as existing safety research predominantly analyzes understanding and generation functionalities in isolation. In this work, we investigate whether cross-functionality reciprocity itself constitutes a structural source of vulnerability in UMMs. We propose RICE: Reciprocal Interaction-based Cross-functionality Exploitation, a novel attack paradigm that explicitly exploits bidirectional interactions between understanding and generation. Using this framework, we systematically evaluate Generation-to-Understanding (G→U) and Understanding-to-Generation (U→G) attack pathways, demonstrating that unsafe intermediate signals can propagate across modalities and amplify safety risks. Extensive experiments show high Attack Success Rates (ASR) in both directions, revealing previously overlooked safety weaknesses inherent to UMMs.
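
The abstract reports results as Attack Success Rate (ASR) per pathway. As a minimal sketch, assuming ASR is the fraction of attack attempts whose output is judged unsafe (the paper's exact judging protocol is not described on this page), per-pathway ASR could be tallied as follows. The function name `attack_success_rate`, the record layout, and the sample records are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def attack_success_rate(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return per-pathway ASR, computed as unsafe outcomes / total attempts.

    Each record is (pathway, is_unsafe), where is_unsafe is the verdict of
    some external safety judge (assumed here; not specified by the source).
    """
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for pathway, is_unsafe in records:
        attempts[pathway] += 1
        if is_unsafe:
            successes[pathway] += 1
    # Only pathways with at least one attempt appear, so no division by zero.
    return {p: successes[p] / attempts[p] for p in attempts}


# Hypothetical judged outcomes for illustration only; not data from the paper.
records = [
    ("G->U", True), ("G->U", False), ("G->U", True), ("G->U", True),
    ("U->G", True), ("U->G", False),
]
asr = attack_success_rate(records)
print(asr)  # {'G->U': 0.75, 'U->G': 0.5}
```

Reporting the two pathways separately, rather than as one pooled rate, mirrors the page's framing of G→U and U→G as distinct attack directions.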


Key Contributions

  • Novel RICE attack paradigm that exploits bidirectional interactions between understanding and generation in unified multimodal models
  • Systematic evaluation of Generation-to-Understanding (G→U) and Understanding-to-Generation (U→G) attack pathways
  • Demonstrates that cross-functionality reciprocity constitutes a structural vulnerability in UMMs, achieving high attack success rates in both directions

🛡️ Threat Analysis

Input Manipulation Attack

Exploits cross-modality interactions to manipulate model behavior at inference time, causing unsafe outputs through intermediate signal propagation across understanding and generation pathways.


Details

Domains
multimodal, vision, generative
Model Types
vlm, multimodal, transformer, diffusion
Threat Tags
inference_time, targeted, digital
Applications
multimodal understanding, text-to-image generation, unified multimodal systems