Defense · 2025

StyleProtect: Safeguarding Artistic Identity in Fine-tuned Diffusion Models

Qiuyu Tang, Joshua Krinsky, Aparna Bharati



Published on arXiv: 2509.13711

Output Integrity Attack

OWASP ML Top 10: ML09

Key Finding

Targeted perturbations restricted to style-sensitive cross-attention layers achieve competitive style protection against fine-tuned diffusion models while maintaining imperceptibility on artworks from 30 artists and anime imagery.

StyleProtect

Novel technique introduced


The rapid advancement of generative models, particularly diffusion-based approaches, has inadvertently facilitated their misuse. Such models allow malicious actors to cheaply replicate artistic styles that embody an artist's creative labor, personal vision, and years of dedication. This has driven growing demand for, and research into, methods that protect artworks against style mimicry. Although generic diffusion models can already mimic an artistic style, fine-tuning amplifies this capability, enabling a model to internalize and reproduce the style with higher fidelity and control. We hypothesize that certain cross-attention layers exhibit heightened sensitivity to artistic style. Sensitivity is measured through the activation strengths of attention layers in response to style and content representations, and by assessing their correlations with features extracted from external models. Based on these findings, we introduce an efficient and lightweight protection strategy, StyleProtect, which achieves effective style defense against fine-tuned diffusion models by updating only selected cross-attention layers. Our experiments use a carefully curated artwork dataset based on WikiArt, comprising representative works from 30 artists known for their distinctive and influential styles, together with cartoon animations from the Anita dataset. The proposed method demonstrates promising performance in safeguarding the unique styles of artworks and anime against malicious diffusion customization, while maintaining competitive imperceptibility.
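The paper does not include pseudocode, but the layer-sensitivity idea can be illustrated with a minimal sketch: pool each cross-attention layer's activations into a vector, then rank layers by cosine similarity against an external style embedding (e.g., from a CSD-style encoder). The function name `rank_style_sensitive_layers` and the dictionary-of-activations interface are hypothetical conveniences, not the authors' API.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two 1-D vectors, with a small epsilon for stability."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def rank_style_sensitive_layers(layer_activations, style_embedding, top_k=3):
    """Rank cross-attention layers by how strongly their pooled activations
    align with an external style embedding; return the top_k layer names.

    layer_activations: dict mapping layer name -> pooled activation vector
    style_embedding:   vector from an external style encoder (illustrative)
    """
    scores = {name: cosine_sim(act, style_embedding)
              for name, act in layer_activations.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

In practice the activation vectors and the style embedding would need to live in a shared (or projected) feature space before the comparison is meaningful; the sketch assumes that projection has already happened.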


Key Contributions

  • Analysis of cross-attention layer sensitivity to artistic style using cosine similarity with CSD embeddings, identifying layers most correlated with style representations
  • Lightweight protection strategy (StyleProtect) that applies adversarial perturbations by updating only style-sensitive cross-attention layers rather than all parameters
  • Curated benchmark dataset of 30 stylistically distinctive artists from WikiArt plus the Anita anime dataset for evaluating style mimicry and protection
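To make the "perturb only through style-sensitive layers" idea concrete, here is a PGD-style sketch under simplifying assumptions: the restriction to selected layers is modeled as a gradient mask, and `grad_fn` stands in for the gradient of some style loss with respect to the image. The function `pgd_perturb`, the mask formulation, and all parameter values are illustrative, not the paper's actual optimization.

```python
import numpy as np

def pgd_perturb(image, grad_fn, selected_mask, eps=8 / 255, alpha=2 / 255, steps=10):
    """Projected-gradient-style perturbation of an image in [0, 1].

    Only gradient components passing the mask (standing in for gradients
    routed through the selected style-sensitive layers) drive the update;
    the perturbation is kept within an L-infinity ball of radius eps.
    """
    delta = np.zeros_like(image)
    for _ in range(steps):
        g = grad_fn(image + delta) * selected_mask   # mask out non-selected directions
        delta = np.clip(delta + alpha * np.sign(g), -eps, eps)  # signed step + ball projection
        delta = np.clip(image + delta, 0.0, 1.0) - image        # keep a valid image
    return image + delta
```

The eps budget is what keeps the perturbation imperceptible; restricting the gradient signal is what concentrates the defense on style rather than content.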

🛡️ Threat Analysis

Output Integrity Attack

StyleProtect embeds protective adversarial perturbations into artistic content (images) to prevent AI-based style mimicry; this is content integrity/provenance protection, analogous to anti-deepfake perturbation schemes such as Glaze and Anti-DreamBooth. The paper protects the authenticity and integrity of artistic content against unauthorized AI replication, fitting ML09's scope of output integrity and content protection mechanisms.


Details

Domains
vision, generative
Model Types
diffusion, transformer
Threat Tags
training_time, digital, white_box
Datasets
WikiArt, Anita
Applications
artistic style protection, artist IP protection, diffusion model customization defense