
Off-The-Shelf Image-to-Image Models Are All You Need To Defeat Image Protection Schemes

Xavier Pleimling 1, Sifat Muhammad Abdullah 1, Gunjan Balde 2, Peng Gao 1, Mainack Mondal 2, Murtuza Jadliwala 3, Bimal Viswanath 1

0 citations · 141 references · arXiv (Cornell University)


Published on arXiv · 2602.22197

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Off-the-shelf image-to-image models used as text-prompted denoisers outperform existing specialized attacks across all 8 case studies, defeating 6 diverse protective perturbation schemes without any purpose-built attack engineering.

img2img denoiser attack

Novel technique introduced


Advances in Generative AI (GenAI) have led to the development of various protection strategies to prevent the unauthorized use of images. These methods rely on adding imperceptible protective perturbations to images to thwart misuse such as style mimicry or deepfake manipulations. Although previous attacks on these protections required specialized, purpose-built methods, we demonstrate that this is no longer necessary. We show that off-the-shelf image-to-image GenAI models can be repurposed as generic "denoisers" using a simple text prompt, effectively removing a wide range of protective perturbations. Across 8 case studies spanning 6 diverse protection schemes, our general-purpose attack not only circumvents these defenses but also outperforms existing specialized attacks while preserving the image's utility for the adversary. Our findings reveal a critical and widespread vulnerability in the current landscape of image protection, indicating that many schemes provide a false sense of security. We stress the urgent need to develop robust defenses and establish that any future protection mechanism must be benchmarked against attacks from off-the-shelf GenAI models. Code is available in this repository: https://github.com/mlsecviswanath/img2imgdenoiser


Key Contributions

  • Shows that off-the-shelf image-to-image generative models can be repurposed as generic denoisers via simple text prompts, requiring no attack-specific engineering
  • Evaluates the approach across 8 case studies spanning 6 diverse image protection schemes (style mimicry and deepfake prevention), outperforming existing specialized attacks while preserving image utility
  • Establishes a new baseline requirement: future image protection mechanisms must be benchmarked against off-the-shelf GenAI model attacks

🛡️ Threat Analysis

Input Manipulation Attack

Protection schemes such as Glaze and Anti-DreamBooth defend images by adding imperceptible adversarial perturbations. This paper attacks those defenses at inference time: an off-the-shelf image-to-image generative model, steered by a simple text prompt, is repurposed as a generic purification/denoising pipeline that strips the protective perturbation, directly targeting adversarial-perturbation-based defense mechanisms.
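In practice, the attack amounts to a single img2img call with a generic "clean photo" prompt and a low re-noising strength. A minimal sketch, assuming a diffusers-style pipeline interface; the prompt, strength, and model id below are illustrative placeholders, not the paper's exact configuration:

```python
def purify(pipe, image,
           prompt="a clean, high-quality, noise-free photograph",
           strength=0.3, guidance_scale=7.5):
    """Strip a protective perturbation with one img2img pass.

    `pipe` is any diffusers-style image-to-image pipeline. A low
    `strength` re-noises the input only lightly, so the diffusion
    process preserves the image's content while washing out the
    imperceptible protective perturbation.
    """
    result = pipe(prompt=prompt, image=image,
                  strength=strength, guidance_scale=guidance_scale)
    return result.images[0]


# Usage sketch (assumes the `diffusers` library is installed; the
# model id is illustrative, not necessarily the one the paper used):
#   from diffusers import StableDiffusionImg2ImgPipeline
#   pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
#       "runwayml/stable-diffusion-v1-5")
#   clean = purify(pipe, protected_image)
```

The `strength` parameter is the key knob: too low and the perturbation survives, too high and the adversary loses the image content they want to reuse.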


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
black_box, inference_time, digital
Applications
image protection schemes, style mimicry prevention, anti-deepfake protection