defense 2025

AuthenLoRA: Entangling Stylization with Imperceptible Watermarks for Copyright-Secure LoRA Adapters

Fangming Shi ¹, Li Li ¹, Kejiang Chen ², Guorui Feng ¹, Xinpeng Zhang ¹,³

¹ Shanghai University

² University of Science and Technology of China

³ Fudan University

0 citations · 46 references · arXiv

Published on arXiv

2511.21216

Output Integrity Attack

OWASP ML Top 10 — ML09

Model Theft

OWASP ML Top 10 — ML05

Key Finding

AuthenLoRA achieves high-fidelity stylization with robust watermark propagation to generated images and significantly lower false-positive rates than existing LoRA watermarking approaches.

AuthenLoRA

Novel technique introduced

Low-Rank Adaptation (LoRA) offers an efficient paradigm for customizing diffusion models, but its ease of redistribution raises concerns over unauthorized use and the generation of untraceable content. Existing watermarking techniques either target base models or verify LoRA modules themselves, yet they fail to propagate watermarks to generated images, leaving a critical gap in traceability. Moreover, traceability watermarking designed for base models is not tightly coupled with stylization and often introduces visual degradation or high false-positive detection rates. To address these limitations, we propose AuthenLoRA, a unified watermarking framework that embeds imperceptible, traceable watermarks directly into the LoRA training process while preserving stylization quality. AuthenLoRA employs a dual-objective optimization strategy that jointly learns the target style distribution and the watermark-induced distribution shift, ensuring that any image generated with the watermarked LoRA reliably carries the watermark. We further design an expanded LoRA architecture for enhanced multi-scale adaptation and introduce a zero-message regularization mechanism that substantially reduces false positives during watermark verification. Extensive experiments demonstrate that AuthenLoRA achieves high-fidelity stylization, robust watermark propagation, and significantly lower false-positive rates compared with existing approaches. Open-source implementation is available at: https://github.com/ShiFangming0823/AuthenLoRA
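For readers unfamiliar with the adapter being watermarked, the following is a minimal sketch of the standard LoRA low-rank update, in which a frozen weight W is augmented by a scaled product of two small matrices. It illustrates generic LoRA only, not AuthenLoRA's expanded architecture; the function name `lora_forward` and the dimensions are illustrative assumptions.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Apply a frozen weight W plus a low-rank LoRA update to input x.

    W: (d_out, d_in) frozen base weight
    A: (r, d_in) trainable down-projection
    B: (d_out, r) trainable up-projection
    alpha: scaling hyperparameter; the update is scaled by alpha / r
    """
    r = A.shape[0]
    delta_W = (alpha / r) * (B @ A)  # rank of delta_W is at most r
    return x @ (W + delta_W).T

# Illustrative shapes (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2
W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))  # common LoRA init: B = 0, so the adapter starts as a no-op
x = rng.normal(size=(3, d_in))
y = lora_forward(x, W, A, B, alpha=4.0)
```

Because only A and B are trained and shipped, the adapter is small and easy to redistribute, which is the property that motivates the copyright concern described above.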

Key Contributions

Dual-objective LoRA training framework that jointly learns target style distribution and watermark-induced distribution shift, ensuring every generated image carries the embedded watermark
Expanded LoRA architecture with multi-scale adaptation and extended ResNet block fine-tuning scope to mitigate conflicts between stylization quality and watermark embedding
Zero-message regularization mechanism that substantially reduces false-positive rates during watermark verification compared to existing approaches
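The contributions above can be read as a weighted sum of three training terms. The sketch below is one plausible reading under stated assumptions, not the authors' implementation: it assumes a denoising MSE for the style objective, a binary cross-entropy on decoded message bits for the watermark objective, and a zero-message term that pushes the decoder toward an all-zero message on images that should not carry the watermark. The names `bce`, `combined_loss`, `lam_wm`, and `lam_zero` are hypothetical.

```python
import numpy as np

def bce(logits, bits):
    """Mean binary cross-entropy between decoder logits and target bits.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    z = np.asarray(logits, dtype=float)
    y = np.asarray(bits, dtype=float)
    return float(np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))))

def combined_loss(noise_pred, noise_true, wm_logits, message,
                  clean_logits, lam_wm=1.0, lam_zero=0.5):
    """Hypothetical dual-objective loss with a zero-message regularizer.

    style: denoising MSE (assumed form of the stylization objective)
    wm:    decoded bits from watermarked outputs should match the message
    zero:  decoded bits from unwatermarked images should be all zeros
           (an assumed interpretation of zero-message regularization)
    """
    style = float(np.mean((np.asarray(noise_pred) - np.asarray(noise_true)) ** 2))
    wm = bce(wm_logits, message)
    zero = bce(clean_logits, np.zeros_like(np.asarray(clean_logits, dtype=float)))
    return style + lam_wm * wm + lam_zero * zero
```

The weights `lam_wm` and `lam_zero` are placeholders for whatever balance the actual method uses between stylization quality and watermark reliability.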

🛡️ Threat Analysis

Model Theft

The secondary contribution is protecting the LoRA adapter as intellectual property: the watermarking mechanism lets owners prove unauthorized redistribution of their LoRA module by verifying images generated with it. The paper explicitly addresses LoRA copyright enforcement and unauthorized commercial exploitation of the adapter.
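To see why false positives matter for an ownership claim, consider a generic bit-matching verifier. The sketch below assumes an idealized null model in which bits extracted from an unwatermarked image are independent fair coin flips, so the false-positive rate is a binomial tail. It is a standard calculation, not the paper's verification procedure; real extractors may deviate from this null model, and the names `false_positive_rate`, `verify`, and `max_fpr` are hypothetical.

```python
from math import comb

def false_positive_rate(num_bits, min_matches):
    """P(at least min_matches of num_bits agree) when extracted bits are fair coin flips."""
    tail = sum(comb(num_bits, i) for i in range(min_matches, num_bits + 1))
    return tail / 2 ** num_bits

def verify(extracted, owner_message, max_fpr=1e-6):
    """Accept ownership only if the observed match count is unlikely by chance.

    max_fpr is an illustrative threshold, not a value from the paper.
    """
    if len(extracted) != len(owner_message):
        raise ValueError("bit strings must have equal length")
    matches = sum(int(a == b) for a, b in zip(extracted, owner_message))
    return false_positive_rate(len(owner_message), matches) <= max_fpr
```

Under this model a 48-bit message that matches in every position has a chance-level probability of 2^-48, while a match on half the bits is indistinguishable from noise.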

Output Integrity Attack

The primary contribution is propagating watermarks from the LoRA parameters into generated images for content traceability, that is, output-level watermarking for provenance authentication. The critical gap it fills over prior work (e.g., BlackboxLoRA) is that the watermark appears in the generated content, not just in the model itself.

Details

Domains

vision, generative

Model Types

diffusion

Threat Tags

training_time

Applications

text-to-image generation, style transfer, LoRA adapter copyright protection
