Most Convolutional Networks Suffer from Small Adversarial Perturbations

Amit Daniely , Idan Mehalel

0 citations · 22 references · arXiv (Cornell University)

Published on arXiv

2602.03415

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Adversarial examples in random CNNs with input dimension d can be constructed at ℓ2-distance of order ||x||/√d — essentially the smallest possible — using a single gradient descent step.


The existence of adversarial examples is relatively well understood for random fully connected neural networks, but much less so for convolutional neural networks (CNNs). The recent work [Daniely, 2025] establishes that adversarial examples can be found in CNNs, at some non-optimal distance from the input. We build on this work and prove that adversarial examples in random CNNs with input dimension $d$ can be found already at $\ell_2$-distance of order $\lVert x \rVert /\sqrt{d}$ from the input $x$, which is essentially the nearest possible. We also show that such small adversarial perturbations can be found using a single step of gradient descent. To derive our results we use Fourier decomposition to efficiently bound the singular values of a random linear convolutional operator, the main ingredient of a CNN layer. This bound might be of independent interest.
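The ||x||/√d scale is easiest to see in the simpler fully connected (here, linear) setting rather than the paper's CNN construction: for a random weight vector w, the distance from a typical input x to the decision boundary ⟨w, x⟩ = 0 concentrates at order ||x||/√d, and one exact gradient step lands on the boundary. A minimal numpy sketch of that intuition (toy linear model, not the paper's CNN argument):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10_000

# Toy random linear "network" f(x) = <w, x>; its decision boundary is
# the hyperplane <w, x> = 0.  (Illustrative only -- the paper proves the
# analogous statement for random CNNs.)
w = rng.standard_normal(d)
x = rng.standard_normal(d)

# Euclidean distance from x to the boundary along the gradient direction.
dist = abs(w @ x) / np.linalg.norm(w)

# For independent random w and x, dist is of order ||x|| / sqrt(d).
print(dist, np.linalg.norm(x) / np.sqrt(d))

# A single gradient step of the right size reaches the boundary exactly,
# so the sign of f flips with a perturbation of that same small norm.
x_adv = x - (w @ x) / (w @ w) * w
print(abs(w @ x_adv), np.linalg.norm(x - x_adv))
```

The perturbation x_adv − x is the orthogonal projection of x onto the boundary, so its norm equals `dist`; the paper's contribution is establishing the same order of magnitude for the much less isotropic operators inside a random CNN.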


Key Contributions

  • Proves adversarial examples in random CNNs exist at ℓ2-distance ||x||/√d, which is essentially optimal and improves over prior non-optimal bounds
  • Shows a single gradient descent step is sufficient to find such near-optimal adversarial perturbations
  • Derives tight bounds on singular values of random linear convolutional operators via Fourier decomposition, a technique with independent theoretical interest
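The Fourier technique rests on a classical fact the paper generalizes to multi-channel CNN layers: a circular convolution is a circulant (hence normal) linear operator, so its singular values are exactly the magnitudes of the DFT of its kernel. A small numpy check of that single-channel 1-D case (my own illustration, not the paper's multi-channel bound):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 64

# Short random filter, zero-padded to length d.
k = np.zeros(d)
k[:5] = rng.standard_normal(5)

# The d x d circulant matrix of the circular convolution with k:
# row i is k circularly shifted by i positions.
C = np.stack([np.roll(k, i) for i in range(d)])

# Circulant matrices are diagonalized by the DFT, so their singular
# values are the magnitudes of the kernel's Fourier coefficients.
sv_svd = np.sort(np.linalg.svd(C, compute_uv=False))
sv_fft = np.sort(np.abs(np.fft.fft(k)))

print(np.max(np.abs(sv_svd - sv_fft)))  # agreement up to float error
```

Bounding the spectrum of the convolutional operator therefore reduces to bounding the Fourier transform of a random kernel, which is what makes the paper's analysis tractable.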

🛡️ Threat Analysis

Input Manipulation Attack

The paper proves adversarial examples exist in random CNNs at essentially the smallest possible ℓ2-distance (||x||/√d) and that a single gradient descent step constructs them — a theoretical advance in understanding adversarial input perturbations at inference time.


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box · inference_time · untargeted · digital
Applications
image classification