
FBA$^2$D: Frequency-based Black-box Attack for AI-generated Image Detection

Xiaojing Chen, Dan Li, Lijun Peng, Jun Yan, Zhiqing Guo, Junyang Chen, Xiao Lan, Zhongjie Ba, Yunfeng Diao

0 citations · 43 references · arXiv


Published on arXiv · arXiv:2512.09264

Input Manipulation Attack

OWASP ML Top 10 — ML01

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

FBA²D successfully evades AIGC detectors in a strict black-box decision-based setting using DCT frequency subspace queries and adversarial example soup initialization, outperforming prior baselines in query efficiency and image quality on Synthetic LSUN and GenImage.

FBA²D

Novel technique introduced


The rapid development of Artificial Intelligence-Generated Content (AIGC) has heightened public anxiety about the spread of false information on social media. Designing detectors to filter such content is an effective defense, but most detectors can be compromised by adversarial examples. Most studies exposing AIGC security issues assume knowledge of model structure and data distribution. In real applications, however, attackers query and interfere with models served as application programming interfaces (APIs), which constitutes the black-box decision-based attack paradigm. To the best of our knowledge, decision-based attacks on AIGC detectors remain unexplored. In this study, we propose FBA²D, a frequency-based black-box attack method for AIGC detection, to fill this research gap. Motivated by frequency-domain discrepancies between generated and real images, we develop a decision-based attack that leverages the Discrete Cosine Transform (DCT) for fine-grained spectral partitioning and selects frequency bands as query subspaces, improving both query efficiency and image quality. Moreover, attacks on AIGC detectors should mitigate initialization failures, preserve image quality, and operate under strict query budgets. To address these issues, we adopt an "adversarial example soup" method, averaging candidates from successive surrogate iterations and using the result as the initialization to accelerate the query-based attack. Empirical studies on the Synthetic LSUN and GenImage datasets demonstrate the effectiveness of our proposed method. This study shows the urgency of addressing practical AIGC security problems.
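The abstract's core mechanism is restricting random query perturbations to a selected DCT frequency band rather than searching the full pixel space. The sketch below illustrates that idea under assumptions: the band selection (a crude normalized distance from the DC corner) and the L∞ normalization are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
from scipy.fft import dctn, idctn

def band_mask(shape, lo_frac, hi_frac):
    """Boolean mask over DCT coefficients whose normalized distance from
    the DC (low-frequency) corner falls in [lo_frac, hi_frac)."""
    h, w = shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    r = (yy / h + xx / w) / 2.0           # crude radial index in [0, 1)
    return (r >= lo_frac) & (r < hi_frac)

def sample_band_perturbation(img, lo_frac, hi_frac, eps, rng):
    """Draw a random perturbation confined to one DCT frequency band,
    then map it back to pixel space (assumed form of the query subspace)."""
    mask = band_mask(img.shape, lo_frac, hi_frac)
    noise = rng.standard_normal(img.shape) * mask   # noise only in the band
    delta = idctn(noise, norm="ortho")              # back to pixel space
    delta *= eps / (np.abs(delta).max() + 1e-12)    # L_inf budget
    return delta

rng = np.random.default_rng(0)
img = rng.random((64, 64))
delta = sample_band_perturbation(img, 0.5, 1.0, eps=8 / 255, rng=rng)
```

Because the inverse DCT is orthogonal, the perturbation's spectrum stays exactly inside the chosen band, which is what lets a decision-based attack query a much lower-dimensional subspace per step.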


Key Contributions

  • FBA²D: a decision-based black-box adversarial attack using DCT spectral partitioning to select frequency-band query subspaces, improving query efficiency and perceptual quality
  • Adversarial example soup initialization method that averages surrogate iteration candidates to reduce initialization failure and accelerate convergence under strict query budgets
  • First empirical study of decision-based black-box attacks against AIGC detectors, validated on Synthetic LSUN and GenImage datasets
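The second contribution, the "adversarial example soup" initialization, averages the candidates produced by successive surrogate-attack iterations and uses the average as the starting point for the query phase. A minimal sketch, assuming an FGSM-style surrogate step (the `grad_fn` surrogate gradient and step sizes are hypothetical):

```python
import numpy as np

def adversarial_soup_init(x, grad_fn, eps, alpha, steps):
    """Run an iterative surrogate attack, collect the per-step candidates,
    and return their average ("soup") as the query-attack initialization."""
    x_adv = x.copy()
    candidates = []
    for _ in range(steps):
        g = grad_fn(x_adv)                        # surrogate gradient (assumed)
        x_adv = x_adv + alpha * np.sign(g)        # FGSM-style ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay in the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # valid pixel range
        candidates.append(x_adv.copy())
    return np.mean(candidates, axis=0)            # the "soup"

# Toy usage with a random stand-in for the surrogate gradient.
rng = np.random.default_rng(1)
x = rng.random((32, 32))
soup = adversarial_soup_init(
    x, lambda z: rng.standard_normal(z.shape),
    eps=8 / 255, alpha=2 / 255, steps=10,
)
```

Since each candidate lies in the eps-ball around `x`, their mean does too (the ball is convex), so the soup is a valid, and empirically more robust, initialization than any single surrogate iterate.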

🛡️ Threat Analysis

Input Manipulation Attack

The paper's core contribution is a novel decision-based black-box adversarial attack that crafts perturbations causing AIGC detector classifiers to misclassify AI-generated images as real at inference time. The technique — DCT frequency-band subspace selection and adversarial example soup initialization — is an input manipulation attack against an ML classifier.

Output Integrity Attack

The target of the attack is an AI-generated image detection system, which is a content integrity and provenance verification tool. Defeating the AIGC detector undermines the content authentication pipeline, making this directly relevant to output integrity and AI-generated content detection security.


Details

Domains
vision
Model Types
cnn · transformer
Threat Tags
black_box · inference_time · targeted · digital
Datasets
Synthetic LSUN · GenImage
Applications
ai-generated image detection · deepfake detection · content authentication