defense · 2025

DeContext as Defense: Safe Image Editing in Diffusion Transformers

Linghui Shen, Mingyue Cui, Xingyi Yang

0 citations · 61 references · arXiv


Published on arXiv: 2512.16625

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

DeContext consistently blocks unauthorized in-context image edits on Flux Kontext and Step1X-Edit by disrupting cross-attention pathways with imperceptible perturbations while retaining visual quality.

DeContext

Novel technique introduced


In-context diffusion models allow users to modify images with remarkable ease and realism. However, the same power raises serious privacy concerns: personal images can be easily manipulated for identity impersonation, misinformation, or other malicious uses, all without the owner's consent. While prior work has explored input perturbations to protect against misuse in personalized text-to-image generation, the robustness of modern, large-scale in-context DiT-based models remains largely unexamined. In this paper, we propose DeContext, a new method to safeguard input images from unauthorized in-context editing. Our key insight is that contextual information from the source image propagates to the output primarily through multimodal attention layers. By injecting small, targeted perturbations that weaken these cross-attention pathways, DeContext breaks this flow, effectively decoupling the output from the input. This simple defense is both efficient and robust. We further show that early denoising steps and specific transformer blocks dominate context propagation, which allows us to concentrate perturbations where they matter most. Experiments on Flux Kontext and Step1X-Edit show that DeContext consistently blocks unwanted image edits while preserving visual quality. These results highlight the effectiveness of attention-based perturbations as a powerful defense against image manipulation. Code is available at https://github.com/LinghuiiShen/DeContext.
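The perturbation objective described above can be sketched, in heavily simplified form, as a PGD-style loop that minimizes the attention signal a source image contributes to the output. The quadratic `context_energy` surrogate, the budget values, and all names below are illustrative stand-ins, not the paper's actual multimodal-attention objective or implementation:

```python
import numpy as np

def context_energy(x, W):
    # Toy surrogate for the cross-attention mass that the source image
    # contributes to the edited output (stand-in for the real objective).
    z = W @ x
    return float(z @ z)

def grad_context_energy(x, W):
    # Analytic gradient of the quadratic surrogate above.
    return 2.0 * W.T @ (W @ x)

def decontext_perturb(x, W, eps=8 / 255, alpha=2 / 255, steps=10):
    """PGD-style perturbation that *minimizes* the context energy,
    constrained to an L-infinity ball of radius eps (imperceptibility)."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = grad_context_energy(x + delta, W)
        delta = np.clip(delta - alpha * np.sign(g), -eps, eps)
        delta = np.clip(x + delta, 0.0, 1.0) - x  # keep a valid image
    return delta

rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=64)   # flattened "source image"
W = rng.normal(size=(16, 64))        # frozen attention-like map
delta = decontext_perturb(x, W)
print(context_energy(x + delta, W) < context_energy(x, W))  # True: signal suppressed
```

The same loop structure applies when the surrogate is replaced by the attention terms of a real DiT editor; only the gradient computation changes.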


Key Contributions

  • DeContext: a method that injects imperceptible adversarial perturbations into images to break cross-attention pathways in in-context diffusion transformers, blocking unauthorized identity-preserving edits
  • Analysis demonstrating that early denoising steps and specific transformer blocks dominate context propagation, enabling concentrated and efficient perturbations
  • Empirical validation on Flux Kontext and Step1X-Edit showing consistent protection against neutral, violent, sexual, and misleading edits while preserving visual quality
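The second contribution, concentrating the perturbation budget where context propagation is strongest, can be illustrated with a small scheduling helper. The early-step fraction, the per-block attention-mass scores, and the function name are hypothetical; the paper's exact selection procedure may differ:

```python
import numpy as np

def perturbation_schedule(num_steps, block_scores, early_frac=0.3, top_k=4):
    """Pick (timesteps, blocks) on which to spend the perturbation budget:
    the earliest `early_frac` of denoising steps, plus the `top_k` blocks
    with the highest measured context-attention mass."""
    t_cut = max(1, int(round(early_frac * num_steps)))
    timesteps = list(range(t_cut))  # early denoising steps dominate
    blocks = sorted(np.argsort(block_scores)[::-1][:top_k].tolist())
    return timesteps, blocks

# Hypothetical per-block attention mass from source tokens to output tokens.
scores = np.array([0.02, 0.10, 0.31, 0.05, 0.27, 0.08, 0.12, 0.05])
ts, blks = perturbation_schedule(50, scores)
print(ts[-1], blks)  # 14 [1, 2, 4, 6]
```

Restricting the attack loss to this subset of steps and blocks is what makes the perturbation both cheaper to compute and harder to wash out, per the paper's analysis.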

🛡️ Threat Analysis

Input Manipulation Attack

The paper's core contribution is crafting adversarial perturbations that cause in-context diffusion models to fail at inference time, a defense-side application of adversarial input manipulation. The perturbations target multimodal cross-attention layers to break context propagation from the source image to the output: mechanistically, an adversarial-perturbation defense against a specific inference-time threat.


Details

Domains
vision · generative
Model Types
diffusion · transformer
Threat Tags
white_box · inference_time · targeted · digital
Applications
image editing protection · identity impersonation prevention · content manipulation defense