TAIGen: Training-Free Adversarial Image Generation via Diffusion Models
Susim Roy, Anubhooti Jain, Mayank Vatsa, Richa Singh
Published on arXiv: 2508.15020
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Achieves up to 97.8% attack success rate against ShuffleNet (with VGGNet as source) on ImageNet while generating adversarial examples 10x faster than existing diffusion-based attacks.
TAIGen
Novel technique introduced
Adversarial attacks from generative models often produce low-quality images and require substantial computational resources. Diffusion models, though capable of high-quality generation, typically need hundreds of sampling steps for adversarial generation. This paper introduces TAIGen, a training-free black-box method for efficient adversarial image generation. TAIGen produces adversarial examples using only 3-20 sampling steps from unconditional diffusion models. Our key finding is that perturbations injected during the mixing step interval achieve comparable attack effectiveness without processing all timesteps. We develop a selective RGB channel strategy that applies attention maps to the red channel while using GradCAM-guided perturbations on the green and blue channels. This design preserves image structure while maximizing misclassification in target models. TAIGen maintains visual quality with PSNR above 30 dB across all tested datasets. On ImageNet with VGGNet as the source model, TAIGen achieves 70.6% success against ResNet, 80.8% against MNASNet, and 97.8% against ShuffleNet. The method generates adversarial examples 10x faster than existing diffusion-based attacks. Against purification defenses, TAIGen also yields the lowest robust accuracy, indicating it is the most impactful attack: the defense is least successful at purifying the images it generates.
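The few-step sampling idea can be sketched as a short reverse-diffusion loop in which perturbations are injected only during the mixing step interval. This is a minimal illustration, not the paper's implementation: `denoise_fn`, `perturb_fn`, and the interval bounds are hypothetical stand-ins for the diffusion denoiser and the attack's perturbation step.

```python
import numpy as np

def taigen_style_sample(x, denoise_fn, perturb_fn, num_steps=10,
                        mix_interval=(3, 7)):
    """Hedged sketch of few-step adversarial diffusion sampling.

    Assumption: a short sampling schedule (3-20 steps) is run, and
    adversarial perturbations are injected only while the step index
    falls inside the 'mixing step interval', rather than at every
    timestep. denoise_fn and perturb_fn are hypothetical callables
    standing in for the diffusion model's reverse step and the
    attack's perturbation update.
    """
    lo, hi = mix_interval
    for t in range(num_steps):
        x = denoise_fn(x, t)       # one reverse-diffusion step
        if lo <= t < hi:           # inject only inside the interval
            x = perturb_fn(x, t)   # adversarial perturbation step
    # Keep the result a valid image in [0, 1]
    return np.clip(x, 0.0, 1.0)
```

Because the perturbation is applied in only a few of the 3-20 steps, the cost of the attack stays close to that of plain short-schedule sampling.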
Key Contributions
- Training-free black-box adversarial example generation via unconditional diffusion models requiring only 3–20 sampling steps instead of hundreds
- Selective RGB channel strategy applying attention maps to the red channel and GradCAM-guided MI-FGSM perturbations to green and blue channels to preserve image structure while maximizing misclassification
- 10x speedup over existing diffusion-based adversarial attacks while maintaining PSNR above 30 dB and achieving up to 97.8% attack success rate on ImageNet
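The channel-selective MI-FGSM update described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the exact masking rule is not specified here, so gating the red channel by an attention map and the green/blue channels by a GradCAM map is an assumed form, and `mi_fgsm_channel_step`, `attn_red`, and `gradcam_gb` are hypothetical names.

```python
import numpy as np

def mi_fgsm_channel_step(x, grad, momentum, attn_red, gradcam_gb,
                         alpha=2/255, mu=1.0, eps=8/255, x_orig=None):
    """One hedged MI-FGSM step with channel-selective masking.

    x:          (H, W, 3) image in [0, 1]
    grad:       (H, W, 3) loss gradient w.r.t. x from the source model
    momentum:   (H, W, 3) accumulated MI-FGSM momentum
    attn_red:   (H, W) attention map in [0, 1], gates the red channel
    gradcam_gb: (H, W) GradCAM map in [0, 1], gates green and blue
    """
    if x_orig is None:
        x_orig = x
    # MI-FGSM: accumulate momentum with an L1-normalized gradient
    momentum = mu * momentum + grad / (np.sum(np.abs(grad)) + 1e-12)
    step = alpha * np.sign(momentum)
    # Channel-selective masking (assumed form): attention on red,
    # GradCAM on green and blue
    step[..., 0] *= attn_red
    step[..., 1] *= gradcam_gb
    step[..., 2] *= gradcam_gb
    x_adv = x + step
    # Project into the eps-ball around the original and the valid range
    x_adv = np.clip(x_adv, x_orig - eps, x_orig + eps)
    return np.clip(x_adv, 0.0, 1.0), momentum
```

Masking the step per channel is what lets the attack concentrate perturbations where the maps indicate the classifier is sensitive, preserving overall image structure (and hence PSNR) in the remaining regions.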
🛡️ Threat Analysis
TAIGen generates adversarial examples that cause misclassification at inference time using a diffusion model with GradCAM-guided perturbations and MI-FGSM momentum, targeting black-box image classifiers through transferability.