defense 2025

Adapter Shield: A Unified Framework with Built-in Authentication for Preventing Unauthorized Zero-Shot Image-to-Image Generation

Jun Jia, Hongyi Miao, Yingjie Zhou, Wangqiu Zhou, Jianbo Zhang, Linhan Cao, Dandan Zhu, Hua Yang, Xiongkuo Min, Wei Sun, Guangtao Zhai


Published on arXiv: 2512.00075

Input Manipulation Attack

OWASP ML Top 10 — ML01

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Adapter Shield surpasses existing state-of-the-art defenses in blocking unauthorized zero-shot image synthesis across identity cloning and style imitation while maintaining full generation quality for authenticated users.

Adapter Shield

Novel technique introduced


With the rapid progress of diffusion models, image synthesis has advanced to zero-shot image-to-image generation, where a facial identity or artistic style can be replicated with high fidelity from a single portrait or artwork, without modifying any model weights. Although these techniques significantly expand creative possibilities, they also pose substantial risks of intellectual property violation, including unauthorized identity cloning and stylistic imitation. To counter such threats, this work presents Adapter Shield, the first universal, authentication-integrated solution for defending personal images against misuse in zero-shot generation scenarios. We first investigate how current zero-shot methods employ image encoders to extract embeddings from input images, which are then fed into the UNet of diffusion models through cross-attention layers. Building on this mechanism, we construct a reversible encryption system that maps original embeddings into distinct encrypted representations according to different secret keys. Authorized users can restore the authentic embeddings via a decryption module and the correct key, enabling normal generation. For protection, we design a multi-target adversarial perturbation method that actively shifts the original embeddings toward designated encrypted patterns. Consequently, protected images carry a defensive layer that ensures unauthorized users can only produce distorted or encrypted outputs. Extensive evaluations demonstrate that our method surpasses existing state-of-the-art defenses in blocking unauthorized zero-shot image synthesis, while supporting flexible and secure access control for verified users.
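The reversible encryption system can be pictured as a keyed invertible transform applied in the image encoder's embedding space. The following is a minimal sketch, not the paper's actual construction: it assumes a random orthogonal matrix derived from a secret key as a stand-in cipher, chosen because orthogonal maps invert losslessly by transposition.

```python
import numpy as np

def keyed_orthogonal(key: int, dim: int) -> np.ndarray:
    """Derive an orthogonal (hence invertible) matrix from a secret key.
    Hypothetical stand-in for the paper's encryption module."""
    rng = np.random.default_rng(key)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def encrypt(embedding: np.ndarray, key: int) -> np.ndarray:
    # Map the original embedding to a key-specific encrypted pattern.
    return keyed_orthogonal(key, embedding.shape[-1]) @ embedding

def decrypt(encrypted: np.ndarray, key: int) -> np.ndarray:
    # An orthogonal matrix inverts by transpose, so the map is lossless.
    return keyed_orthogonal(key, encrypted.shape[-1]).T @ encrypted

emb = np.random.default_rng(0).standard_normal(768)  # e.g. an image-encoder embedding
enc = encrypt(emb, key=42)
assert np.allclose(decrypt(enc, key=42), emb)   # correct key restores the embedding
assert not np.allclose(decrypt(enc, key=7), emb)  # wrong key yields a distorted embedding
```

In this toy model, feeding the decrypted embedding to the UNet's cross-attention layers would reproduce normal generation for the key holder, while any other key leaves the conditioning scrambled.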


Key Contributions

  • First universal authentication-integrated defense against zero-shot image-to-image generation misuse, targeting the embedding space of image encoders feeding into diffusion UNets via cross-attention.
  • Reversible encryption system mapping embeddings to distinct encrypted representations keyed per user, enabling authorized decryption while producing distorted outputs for unauthorized access.
  • Multi-target adversarial perturbation method that actively shifts image embeddings toward designated encrypted patterns, surpassing existing state-of-the-art defenses on identity cloning and style imitation tasks.

🛡️ Threat Analysis

Input Manipulation Attack

The core mechanism is a multi-target adversarial perturbation that shifts image-encoder embeddings at inference time, causing diffusion models to produce distorted or encrypted outputs for unauthorized users. In effect, adversarial input manipulation is repurposed as a defensive tool against generation pipelines.
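The perturbation idea can be sketched as projected gradient steps that pull a protected image's embedding toward a designated encrypted target under a small pixel budget. This toy version assumes a linear stand-in encoder so the gradient is closed-form; the actual method would backpropagate through the real image encoder, and the step size and budget here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 256)) * 0.1  # toy linear "image encoder": pixels -> embedding

def encode(x):
    return W @ x

def perturb_toward(x, target, steps=200, lr=0.05, eps=8 / 255):
    """Projected gradient steps pulling encode(x + delta) toward a
    designated encrypted target embedding, under an L-inf budget eps."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        residual = encode(x + delta) - target  # embedding-space error
        grad = W.T @ residual                  # gradient of 0.5*||residual||^2 w.r.t. delta
        delta -= lr * grad
        delta = np.clip(delta, -eps, eps)      # keep the perturbation imperceptible
    return x + delta

x = rng.uniform(0, 1, 256)        # flattened "image"
target = rng.standard_normal(64)  # key-specific encrypted embedding pattern
x_prot = perturb_toward(x, target)
# After optimization the protected image's embedding sits closer to the target,
# so an unauthorized pipeline conditions on the encrypted pattern instead.
assert np.linalg.norm(encode(x_prot) - target) < np.linalg.norm(encode(x) - target)
```

With multiple targets (one per key), the same loop would be run against each user-specific encrypted pattern, which is what makes the defense key-aware rather than a single fixed cloak.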

Output Integrity Attack

The overarching security goal is protecting content (facial identities, artistic styles) from unauthorized AI-generated replication — a content integrity and provenance concern. The paper also builds reversible encryption/authentication into the protection scheme to control who can generate from protected images.


Details

Domains
vision, generative
Model Types
diffusion, transformer
Threat Tags
white_box, inference_time, digital
Datasets
FFHQ, WikiArt
Applications
zero-shot image-to-image generation, facial identity protection, artistic style protection