defense arXiv Jan 11, 2026 · 12w ago
Qingyu Liu, Zhongjie Ba, Jianmin Guo et al. · Zhejiang University · Huawei
Proposes efficient reconstruction-based AIGC detector covering GANs, VAEs, and diffusion models with 22x speedup over prior methods
Output Integrity Attack visiongenerative
Recently, reconstruction-based methods have gained attention for AIGC image detection. These methods leverage pre-trained diffusion models to reconstruct inputs and measure residuals for distinguishing real from fake images. Their key advantage lies in reducing reliance on dataset-specific artifacts and improving generalization under distribution shifts. However, they are limited by significant inefficiency due to multi-step inversion and reconstruction, and their reliance on diffusion backbones further limits generalization to other generative paradigms such as GANs. In this paper, we propose a novel fake image detection framework, called R$^2$BD, built upon two key designs: (1) G-LDM, a unified reconstruction model that simulates the generation behaviors of VAEs, GANs, and diffusion models, thereby broadening the detection scope beyond prior diffusion-only approaches; and (2) a residual bias calculation module that distinguishes real and fake images in a single inference step, which is a significant efficiency improvement over existing methods that typically require 20$+$ steps. Extensive experiments on the benchmark from 10 public datasets demonstrate that R$^2$BD is over 22$\times$ faster than existing reconstruction-based methods while achieving superior detection accuracy. In cross-dataset evaluations, it outperforms state-of-the-art methods by an average of 13.87\%, showing strong efficiency and generalization across diverse generative methods. The code and dataset used for evaluation are available at https://github.com/QingyuLiu/RRBD.
diffusion gan Zhejiang University · Huawei
defense arXiv Jan 10, 2026 · 12w ago
Qingyu Liu, Yitao Zhang, Zhongjie Ba et al. · Zhejiang University
Defends AIGC image copyright by embedding diffusion-trajectory-coupled watermarks robust to 12 real-world spoofing and removal attacks
Output Integrity Attack visiongenerative
Protecting the copyright of user-generated AI images is an emerging challenge as AIGC becomes pervasive in creative workflows. Existing watermarking methods (1) remain vulnerable to real-world adversarial threats, often forced to trade off between defenses against spoofing and removal attacks; and (2) cannot support semantic-level tamper localization. We introduce PAI, a training-free inherent watermarking framework for AIGC copyright protection, plug-and-play with diffusion-based AIGC services. PAI simultaneously provides three key functionalities: robust ownership verification, attack detection, and semantic-level tampering localization. Unlike existing inherent watermark methods that only embed watermarks at noise initialization of diffusion models, we design a novel key-conditioned deflection mechanism that subtly steers the denoising trajectory according to the user key. Such trajectory-level coupling further strengthens the semantic entanglement of identity and content, thereby further enhancing robustness against real-world threats. Moreover, we also provide a theoretical analysis proving that only the valid key can pass verification. Experiments across 12 attack methods show that PAI achieves 98.43\% verification accuracy, improving over SOTA methods by 37.25\% on average, and retains strong tampering localization performance even against advanced AIGC edits. Our code is available at https://github.com/QingyuLiu/PAI.
diffusion Zhejiang University
attack arXiv Dec 10, 2025 · Dec 2025
Xiaojing Chen, Dan Li, Lijun Peng et al.
Proposes frequency-domain black-box adversarial attack that evades AIGC detectors via DCT subspace perturbations under tight query budgets
Input Manipulation Attack Output Integrity Attack vision
The prosperous development of Artificial Intelligence-Generated Content (AIGC) has brought people's anxiety about the spread of false information on social media. Designing detectors for filtering is an effective defense method, but most detectors will be compromised by adversarial samples. Currently, most studies exposing AIGC security issues assume information on model structure and data distribution. In real applications, attackers query and interfere with models that provide services in the form of application programming interfaces (APIs), which constitutes the black-box decision-based attack paradigm. However, to the best of our knowledge, decision-based attacks on AIGC detectors remain unexplored. In this study, we propose \textbf{FBA$^2$D}: a frequency-based black-box attack method for AIGC detection to fill the research gap. Motivated by frequency-domain discrepancies between generated and real images, we develop a decision-based attack that leverages the Discrete Cosine Transform (DCT) for fine-grained spectral partitioning and selects frequency bands as query subspaces, improving both query efficiency and image quality. Moreover, attacks on AIGC detectors should mitigate initialization failures, preserve image quality, and operate under strict query budgets. To address these issues, we adopt an ``adversarial example soup'' method, averaging candidates from successive surrogate iterations and using the result as the initialization to accelerate the query-based attack. The empirical study on the Synthetic LSUN dataset and GenImage dataset demonstrate the effectiveness of our prosed method. This study shows the urgency of addressing practical AIGC security problems.
cnn transformer