defense arXiv Apr 8, 2026 · 6w ago
Yifan Zhu, Yihan Wang, Xiao-Shan Gao · Chinese Academy of Sciences · University of Chinese Academy of Sciences +1 more
Instance-specific watermarking defense for diffusion models resisting removal and forgery attacks via dynamic injection and two-sided detection
Output Integrity Attack vision generative
Generated content has raised serious concerns about copyright protection, image provenance, and credit attribution. A potential solution to these problems is watermarking. Recently, content watermarking for text-to-image diffusion models has been studied extensively for its effective detection utility and robustness. However, these watermarking techniques are vulnerable to adversarial attacks such as removal attacks and forgery attacks. In this paper, we build a novel watermarking paradigm called Instance-Specific watermarking with Two-Sided detection (ISTS) to resist removal and forgery attacks. Specifically, we introduce a strategy that dynamically controls the injection time and watermarking patterns based on the semantics of users' prompts. Furthermore, we propose a new two-sided detection approach to enhance robustness in watermark detection. Experiments demonstrate the superiority of our watermarking against removal and forgery attacks.
diffusion Chinese Academy of Sciences · University of Chinese Academy of Sciences · University of Waterloo
defense arXiv Mar 4, 2026 · 11w ago
Yifan Zhu, Yibo Miao, Yinpeng Dong et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +2 more
Proposes MI-UE, a theoretically grounded availability-poisoning defense that blocks unauthorized model training by reducing mutual information in poisoned image features
Data Poisoning Attack vision
The volume of freely scraped data on the Internet has driven the tremendous success of deep learning. Along with this comes growing concern about data privacy and security. Numerous methods for generating unlearnable examples have been proposed to prevent data from being illicitly learned by unauthorized deep models by impeding generalization. However, the existing approaches primarily rely on empirical heuristics, making it challenging to improve unlearnable examples on a solid explanatory basis. In this paper, we analyze and improve unlearnable examples from a novel perspective: mutual information reduction. We demonstrate that effective unlearnable examples always decrease the mutual information between clean features and poisoned features, and that as the network gets deeper, unlearnability improves while this mutual information decreases. Further, we prove from a covariance reduction perspective that minimizing the conditional covariance of intra-class poisoned features reduces the mutual information between distributions. Based on the theoretical results, we propose a novel unlearnable-example method called Mutual Information Unlearnable Examples (MI-UE) that reduces covariance by maximizing the cosine similarity among intra-class features, thus effectively impeding generalization. Extensive experiments demonstrate that our approach significantly outperforms previous methods, even under defense mechanisms.
cnn transformer Chinese Academy of Sciences · University of Chinese Academy of Sciences · Tsinghua University +1 more
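The MI-UE abstract above describes reducing covariance by maximizing cosine similarity among intra-class features. As a rough illustration only, the quantity being maximized might be sketched as the mean pairwise cosine similarity over same-class feature vectors. The function names (`cosine_similarity`, `intra_class_cosine_similarity`, `mi_ue_style_loss`), the plain-list feature representation, and the zero-vector convention are all assumptions of this sketch, not the authors' implementation; the abstract does not specify how features are extracted or how perturbations are optimized.

```python
import math
from collections import defaultdict
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption of this sketch: treat zero vectors as having similarity 0.
        return 0.0
    return dot / (norm_u * norm_v)


def intra_class_cosine_similarity(features, labels):
    """Mean cosine similarity over all pairs of features sharing a label."""
    by_class = defaultdict(list)
    for feat, label in zip(features, labels):
        by_class[label].append(feat)
    sims = [
        cosine_similarity(u, v)
        for group in by_class.values()
        for u, v in combinations(group, 2)
    ]
    if not sims:
        raise ValueError("need at least one class with two or more samples")
    return sum(sims) / len(sims)


def mi_ue_style_loss(features, labels):
    """Hypothetical loss: minimizing it maximizes intra-class similarity."""
    return -intra_class_cosine_similarity(features, labels)


# Toy features: each class is perfectly aligned along one axis.
features = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
labels = [0, 0, 1, 1]
print(mi_ue_style_loss(features, labels))
```

In a real pipeline the features would presumably come from a network evaluated on the poisoned images, and the perturbation would be updated by gradient descent on a loss of this kind; both steps are outside what the abstract states.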