Defense · 2025

BlurGuard: A Simple Approach for Robustifying Image Protection Against AI-Powered Editing

Jinsu Kim 1, Yunhun Nam 1, Minseon Kim 2, Sangpil Kim 1, Jongheon Jeong 1

0 citations · 101 references · arXiv


Published on arXiv · 2511.00143

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Adaptive per-region Gaussian blur on adversarial protective noise consistently improves worst-case protection against a wide range of reversal techniques (e.g., JPEG compression) while reducing noise-induced quality degradation.

BlurGuard

Novel technique introduced


Recent advances in text-to-image models have made powerful image editing techniques widely accessible, raising concerns about their potential for malicious use. An emerging line of research to address such threats focuses on implanting "protective" adversarial noise into images before their public release, so that future attempts to edit them using text-to-image models are impeded. However, subsequent works have shown that these adversarial noises are often easily "reversed," e.g., with techniques as simple as JPEG compression, casting doubt on the practicality of the approach. In this paper, we argue that adversarial noise for image protection should not only be imperceptible, as has been the primary focus of prior work, but also irreversible, viz., it should be difficult to detect as noise provided that the original image is hidden. We propose a surprisingly simple method to enhance the robustness of image protection methods against noise reversal techniques: it applies an adaptive per-region Gaussian blur to the noise to adjust its overall frequency spectrum. Through extensive experiments, we show that our method consistently improves the per-sample worst-case protection performance of existing methods against a wide range of reversal techniques across diverse image editing scenarios, while also reducing noise-induced quality degradation in terms of perceptual metrics. Code is available at https://github.com/jsu-kim/BlurGuard.


Key Contributions

  • Identifies that adversarial image protections must be not only imperceptible but also irreversible (hard to detect as noise without the original image)
  • Proposes BlurGuard: adaptive per-region Gaussian blur applied to protective adversarial noise to reshape its frequency spectrum, improving robustness against reversal techniques
  • Demonstrates consistent improvement in worst-case per-sample protection performance across diverse image editing scenarios and reversal methods, while also reducing perceptual quality degradation
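The core idea can be sketched in a few lines: extract the protective noise (protected minus clean image), blur it tile by tile with a region-dependent Gaussian kernel, and add the smoothed noise back. The tile size, the variance-based sigma schedule, and the function name `blur_protective_noise` are illustrative assumptions, not the paper's actual implementation; see the linked repository for the real method.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_protective_noise(clean, protected, tile=32, sigma_max=2.0):
    """Sketch of adaptive per-region Gaussian blur on protective noise.

    Assumes float RGB arrays of shape (H, W, 3). The per-tile sigma
    schedule (scaled by local noise variance) is a hypothetical choice
    for illustration; BlurGuard's actual adaptation rule may differ.
    """
    noise = protected - clean
    out = np.empty_like(noise)
    h, w = noise.shape[:2]
    vmax = max(float(noise.var()), 1e-8)  # global variance as normalizer
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            region = noise[y:y + tile, x:x + tile]
            # Stronger blur where the noise is locally stronger,
            # damping the high-frequency content that reversal
            # techniques such as JPEG compression exploit.
            sigma = min(sigma_max, sigma_max * float(region.var()) / vmax)
            out[y:y + tile, x:x + tile] = gaussian_filter(
                region, sigma=(sigma, sigma, 0)  # blur spatially, not across channels
            )
    return clean + out
```

Because Gaussian smoothing is a weighted average, the blurred noise has lower high-frequency energy (and lower variance) than the original perturbation, which is the mechanism the key finding attributes to the improved irreversibility and reduced perceptual degradation.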

🛡️ Threat Analysis

Output Integrity Attack

The paper directly addresses output integrity: it strengthens content protection schemes (adversarial noise) against reversal/removal attacks (JPEG compression, denoising) that defeat image protections. Per the taxonomy, attacking or defending protective image perturbations (anti-editing, anti-deepfake noise) falls under ML09, not ML01, because the threat is removal of content protection, not crafting adversarial examples that cause misclassification.


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
digital, inference-time
Applications
image protection against AI-powered editing, text-to-image model manipulation prevention