Most Convolutional Networks Suffer from Small Adversarial Perturbations

Amit Daniely , Idan Mehalel

0 citations · 22 references · arXiv (Cornell University)

Published on arXiv

2602.03415

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Adversarial examples in random CNNs with input dimension d can be constructed at ℓ2-distance of order ||x||/√d — essentially the smallest possible — using a single gradient descent step.


The existence of adversarial examples is relatively well understood for random fully connected neural networks, but much less so for convolutional neural networks (CNNs). The recent work [Daniely, 2025] establishes that adversarial examples can be found in CNNs, at some non-optimal distance from the input. We build on this work and prove that adversarial examples in random CNNs with input dimension $d$ can be found already at $\ell_2$-distance of order $\lVert x \rVert /\sqrt{d}$ from the input $x$, which is essentially the nearest possible. We also show that such small adversarial perturbations can be found using a single step of gradient descent. To derive our results we use Fourier decomposition to efficiently bound the singular values of a random linear convolutional operator, the main ingredient of a CNN layer. This bound might be of independent interest.
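The ||x||/√d scale is easiest to see in the simpler fully connected (here, linear) setting rather than the paper's CNN construction: for a random weight vector w, the distance from a typical input x to the decision boundary ⟨w, x⟩ = 0 concentrates at order ||x||/√d, and one exact gradient step lands on the boundary. A minimal numpy sketch of that intuition (toy linear model, not the paper's CNN argument):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10_000

# Toy random linear "network" f(x) = <w, x>; its decision boundary is
# the hyperplane <w, x> = 0.  (Illustrative only -- the paper proves the
# analogous statement for random CNNs.)
w = rng.standard_normal(d)
x = rng.standard_normal(d)

# Euclidean distance from x to the boundary along the gradient direction.
dist = abs(w @ x) / np.linalg.norm(w)

# For independent random w and x, dist is of order ||x|| / sqrt(d).
print(dist, np.linalg.norm(x) / np.sqrt(d))

# A single gradient step of the right size reaches the boundary exactly,
# so the sign of f flips with a perturbation of that same small norm.
x_adv = x - (w @ x) / (w @ w) * w
print(abs(w @ x_adv), np.linalg.norm(x - x_adv))
```

The perturbation x_adv − x is the orthogonal projection of x onto the boundary, so its norm equals `dist`; the paper's contribution is establishing the same order of magnitude for the much less isotropic operators inside a random CNN.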


Key Contributions

  • Proves adversarial examples in random CNNs exist at ℓ2-distance ||x||/√d, which is essentially optimal and improves over prior non-optimal bounds
  • Shows a single gradient descent step is sufficient to find such near-optimal adversarial perturbations
  • Derives tight bounds on singular values of random linear convolutional operators via Fourier decomposition, a technique with independent theoretical interest
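The Fourier technique rests on a classical fact the paper generalizes to multi-channel CNN layers: a circular convolution is a circulant (hence normal) linear operator, so its singular values are exactly the magnitudes of the DFT of its kernel. A small numpy check of that single-channel 1-D case (my own illustration, not the paper's multi-channel bound):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 64

# Short random filter, zero-padded to length d.
k = np.zeros(d)
k[:5] = rng.standard_normal(5)

# The d x d circulant matrix of the circular convolution with k:
# row i is k circularly shifted by i positions.
C = np.stack([np.roll(k, i) for i in range(d)])

# Circulant matrices are diagonalized by the DFT, so their singular
# values are the magnitudes of the kernel's Fourier coefficients.
sv_svd = np.sort(np.linalg.svd(C, compute_uv=False))
sv_fft = np.sort(np.abs(np.fft.fft(k)))

print(np.max(np.abs(sv_svd - sv_fft)))  # agreement up to float error
```

Bounding the spectrum of the convolutional operator therefore reduces to bounding the Fourier transform of a random kernel, which is what makes the paper's analysis tractable.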

🛡️ Threat Analysis

Input Manipulation Attack

The paper proves adversarial examples exist in random CNNs at essentially the smallest possible ℓ2-distance (||x||/√d) and that a single gradient descent step constructs them — a theoretical advance in understanding adversarial input perturbations at inference time.


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box · inference_time · untargeted · digital
Applications
image classification