
FBA$^2$D: Frequency-based Black-box Attack for AI-generated Image Detection

Xiaojing Chen, Dan Li, Lijun Peng, Jun Yan, Zhiqing Guo, Junyang Chen, Xiao Lan, Zhongjie Ba, Yunfeng Diao

0 citations · 43 references · arXiv


Published on arXiv · arXiv:2512.09264

Input Manipulation Attack

OWASP ML Top 10 — ML01

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

FBA²D successfully evades AIGC detectors in a strict black-box decision-based setting using DCT frequency subspace queries and adversarial example soup initialization, outperforming prior baselines in query efficiency and image quality on Synthetic LSUN and GenImage.

FBA²D

Novel technique introduced


The rapid development of Artificial Intelligence-Generated Content (AIGC) has heightened public anxiety about the spread of false information on social media. Designing detectors to filter such content is an effective defense, but most detectors can be compromised by adversarial examples. Most studies exposing AIGC security issues assume knowledge of model structure and data distribution. In real applications, however, attackers query and interfere with models served as application programming interfaces (APIs), which constitutes the black-box decision-based attack paradigm. To the best of our knowledge, decision-based attacks on AIGC detectors remain unexplored. In this study, we propose FBA²D, a frequency-based black-box attack method for AIGC detection, to fill this research gap. Motivated by frequency-domain discrepancies between generated and real images, we develop a decision-based attack that leverages the Discrete Cosine Transform (DCT) for fine-grained spectral partitioning and selects frequency bands as query subspaces, improving both query efficiency and image quality. Moreover, attacks on AIGC detectors should mitigate initialization failures, preserve image quality, and operate under strict query budgets. To address these issues, we adopt an "adversarial example soup" method, averaging candidates from successive surrogate iterations and using the result as the initialization to accelerate the query-based attack. Empirical studies on the Synthetic LSUN and GenImage datasets demonstrate the effectiveness of our proposed method. This study shows the urgency of addressing practical AIGC security problems.
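The abstract's core mechanism is restricting random query perturbations to a selected DCT frequency band rather than searching the full pixel space. The sketch below illustrates that idea under assumptions: the band selection (a crude normalized distance from the DC corner) and the L∞ normalization are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
from scipy.fft import dctn, idctn

def band_mask(shape, lo_frac, hi_frac):
    """Boolean mask over DCT coefficients whose normalized distance from
    the DC (low-frequency) corner falls in [lo_frac, hi_frac)."""
    h, w = shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    r = (yy / h + xx / w) / 2.0           # crude radial index in [0, 1)
    return (r >= lo_frac) & (r < hi_frac)

def sample_band_perturbation(img, lo_frac, hi_frac, eps, rng):
    """Draw a random perturbation confined to one DCT frequency band,
    then map it back to pixel space (assumed form of the query subspace)."""
    mask = band_mask(img.shape, lo_frac, hi_frac)
    noise = rng.standard_normal(img.shape) * mask   # noise only in the band
    delta = idctn(noise, norm="ortho")              # back to pixel space
    delta *= eps / (np.abs(delta).max() + 1e-12)    # L_inf budget
    return delta

rng = np.random.default_rng(0)
img = rng.random((64, 64))
delta = sample_band_perturbation(img, 0.5, 1.0, eps=8 / 255, rng=rng)
```

Because the inverse DCT is orthogonal, the perturbation's spectrum stays exactly inside the chosen band, which is what lets a decision-based attack query a much lower-dimensional subspace per step.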


Key Contributions

  • FBA²D: a decision-based black-box adversarial attack using DCT spectral partitioning to select frequency-band query subspaces, improving query efficiency and perceptual quality
  • Adversarial example soup initialization method that averages surrogate iteration candidates to reduce initialization failure and accelerate convergence under strict query budgets
  • First empirical study of decision-based black-box attacks against AIGC detectors, validated on Synthetic LSUN and GenImage datasets
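The second contribution, the "adversarial example soup" initialization, averages the candidates produced by successive surrogate-attack iterations and uses the average as the starting point for the query phase. A minimal sketch, assuming an FGSM-style surrogate step (the `grad_fn` surrogate gradient and step sizes are hypothetical):

```python
import numpy as np

def adversarial_soup_init(x, grad_fn, eps, alpha, steps):
    """Run an iterative surrogate attack, collect the per-step candidates,
    and return their average ("soup") as the query-attack initialization."""
    x_adv = x.copy()
    candidates = []
    for _ in range(steps):
        g = grad_fn(x_adv)                        # surrogate gradient (assumed)
        x_adv = x_adv + alpha * np.sign(g)        # FGSM-style ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay in the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # valid pixel range
        candidates.append(x_adv.copy())
    return np.mean(candidates, axis=0)            # the "soup"

# Toy usage with a random stand-in for the surrogate gradient.
rng = np.random.default_rng(1)
x = rng.random((32, 32))
soup = adversarial_soup_init(
    x, lambda z: rng.standard_normal(z.shape),
    eps=8 / 255, alpha=2 / 255, steps=10,
)
```

Since each candidate lies in the eps-ball around `x`, their mean does too (the ball is convex), so the soup is a valid, and empirically more robust, initialization than any single surrogate iterate.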

🛡️ Threat Analysis

Input Manipulation Attack

The paper's core contribution is a novel decision-based black-box adversarial attack that crafts perturbations causing AIGC detector classifiers to misclassify AI-generated images as real at inference time. The technique — DCT frequency-band subspace selection and adversarial example soup initialization — is an input manipulation attack against an ML classifier.

Output Integrity Attack

The target of the attack is an AI-generated image detection system, which is a content integrity and provenance verification tool. Defeating the AIGC detector undermines the content authentication pipeline, making this directly relevant to output integrity and AI-generated content detection security.


Details

Domains
vision
Model Types
cnn · transformer
Threat Tags
black_box · inference_time · targeted · digital
Datasets
Synthetic LSUN · GenImage
Applications
ai-generated image detection · deepfake detection · content authentication