TAIGen: Training-Free Adversarial Image Generation via Diffusion Models
Susim Roy, Anubhooti Jain, Mayank Vatsa, Richa Singh
Published on arXiv: 2508.15020
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Achieves up to 97.8% attack success rate against ShuffleNet (with VGGNet as source) on ImageNet while generating adversarial examples 10x faster than existing diffusion-based attacks.
TAIGen
Novel technique introduced
Adversarial attacks from generative models often produce low-quality images and require substantial computational resources. Diffusion models, though capable of high-quality generation, typically need hundreds of sampling steps for adversarial generation. This paper introduces TAIGen, a training-free black-box method for efficient adversarial image generation. TAIGen produces adversarial examples using only 3-20 sampling steps from unconditional diffusion models. Our key finding is that perturbations injected during the mixing step interval achieve comparable attack effectiveness without processing all timesteps. We develop a selective RGB channel strategy that applies attention maps to the red channel while using GradCAM-guided perturbations on the green and blue channels. This design preserves image structure while maximizing misclassification in target models. TAIGen maintains visual quality with PSNR above 30 dB across all tested datasets. On ImageNet with VGGNet as the source model, TAIGen achieves 70.6% success against ResNet, 80.8% against MNASNet, and 97.8% against ShuffleNet. The method generates adversarial examples 10x faster than existing diffusion-based attacks. Against purification defenses, TAIGen also yields the lowest robust accuracy, indicating it is the most impactful attack: the defense is least successful at purifying the images it generates.
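The few-step sampling idea can be sketched as a short reverse-diffusion loop in which perturbations are injected only during the mixing step interval. This is a minimal illustration, not the paper's implementation: `denoise_fn`, `perturb_fn`, and the interval bounds are hypothetical stand-ins for the diffusion denoiser and the attack's perturbation step.

```python
import numpy as np

def taigen_style_sample(x, denoise_fn, perturb_fn, num_steps=10,
                        mix_interval=(3, 7)):
    """Hedged sketch of few-step adversarial diffusion sampling.

    Assumption: a short sampling schedule (3-20 steps) is run, and
    adversarial perturbations are injected only while the step index
    falls inside the 'mixing step interval', rather than at every
    timestep. denoise_fn and perturb_fn are hypothetical callables
    standing in for the diffusion model's reverse step and the
    attack's perturbation update.
    """
    lo, hi = mix_interval
    for t in range(num_steps):
        x = denoise_fn(x, t)       # one reverse-diffusion step
        if lo <= t < hi:           # inject only inside the interval
            x = perturb_fn(x, t)   # adversarial perturbation step
    # Keep the result a valid image in [0, 1]
    return np.clip(x, 0.0, 1.0)
```

Because the perturbation is applied in only a few of the 3-20 steps, the cost of the attack stays close to that of plain short-schedule sampling.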
Key Contributions
- Training-free black-box adversarial example generation via unconditional diffusion models requiring only 3–20 sampling steps instead of hundreds
- Selective RGB channel strategy applying attention maps to the red channel and GradCAM-guided MI-FGSM perturbations to green and blue channels to preserve image structure while maximizing misclassification
- 10x speedup over existing diffusion-based adversarial attacks while maintaining PSNR above 30 dB and achieving up to 97.8% attack success rate on ImageNet
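The channel-selective MI-FGSM update described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the exact masking rule is not specified here, so gating the red channel by an attention map and the green/blue channels by a GradCAM map is an assumed form, and `mi_fgsm_channel_step`, `attn_red`, and `gradcam_gb` are hypothetical names.

```python
import numpy as np

def mi_fgsm_channel_step(x, grad, momentum, attn_red, gradcam_gb,
                         alpha=2/255, mu=1.0, eps=8/255, x_orig=None):
    """One hedged MI-FGSM step with channel-selective masking.

    x:          (H, W, 3) image in [0, 1]
    grad:       (H, W, 3) loss gradient w.r.t. x from the source model
    momentum:   (H, W, 3) accumulated MI-FGSM momentum
    attn_red:   (H, W) attention map in [0, 1], gates the red channel
    gradcam_gb: (H, W) GradCAM map in [0, 1], gates green and blue
    """
    if x_orig is None:
        x_orig = x
    # MI-FGSM: accumulate momentum with an L1-normalized gradient
    momentum = mu * momentum + grad / (np.sum(np.abs(grad)) + 1e-12)
    step = alpha * np.sign(momentum)
    # Channel-selective masking (assumed form): attention on red,
    # GradCAM on green and blue
    step[..., 0] *= attn_red
    step[..., 1] *= gradcam_gb
    step[..., 2] *= gradcam_gb
    x_adv = x + step
    # Project into the eps-ball around the original and the valid range
    x_adv = np.clip(x_adv, x_orig - eps, x_orig + eps)
    return np.clip(x_adv, 0.0, 1.0), momentum
```

Masking the step per channel is what lets the attack concentrate perturbations where the maps indicate the classifier is sensitive, preserving overall image structure (and hence PSNR) in the remaining regions.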
🛡️ Threat Analysis
TAIGen generates adversarial examples that cause misclassification at inference time using a diffusion model with GradCAM-guided perturbations and MI-FGSM momentum, targeting black-box image classifiers through transferability.