
DPAC: Distribution-Preserving Adversarial Control for Diffusion Sampling

Han-Jin Lee , Han-Ju Lee , Jin-Seong Kim , Seok-Hwan Choi

0 citations · 29 references · arXiv


Published on arXiv · 2512.01153

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

DPAC achieves lower FID and path-KL at matched attack success rates compared to standard adversarially guided diffusion on ImageNet-100, with the tangent projection reducing the Wasserstein quality gap from O(Δt) to O(Δt²)

DPAC (Distribution-Preserving Adversarial Control)

Novel technique introduced


Adversarially guided diffusion sampling often reaches the target class, but sample quality degrades as deviations between the adversarially controlled and nominal trajectories accumulate. We formalize this degradation as a path-space Kullback-Leibler divergence (path-KL) between the controlled and nominal (uncontrolled) diffusion processes, showing via Girsanov's theorem that it exactly equals the control energy. Building on this stochastic optimal control (SOC) view, we establish that minimizing this path-KL simultaneously tightens upper bounds on both the 2-Wasserstein distance and the Fréchet Inception Distance (FID), revealing a principled connection between adversarial control energy and perceptual fidelity. From a variational perspective, we derive a first-order optimality condition for the control: among all directions that yield the same classification gain, the component tangent to iso-(log-)density surfaces (i.e., orthogonal to the score) minimizes path-KL, whereas the normal component directly increases distributional drift. This leads to DPAC (Distribution-Preserving Adversarial Control), a diffusion guidance rule that projects adversarial gradients onto the tangent space defined by the generative score geometry. We further show that in discrete solvers, the tangent projection cancels the O(Δt) leading error term in the Wasserstein distance, achieving an O(Δt²) quality gap; moreover, it remains second-order robust to score or metric approximation. Empirical studies on ImageNet-100 validate the theoretical predictions, confirming that DPAC achieves lower FID and estimated path-KL at matched attack success rates.
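The core geometric operation described above can be sketched in a few lines: remove from the adversarial gradient its component along the score (the normal to the iso-density surface), keeping only the tangent part. This is an illustrative sketch, not the paper's implementation; the function name `tangent_project` and the Euclidean metric are assumptions, and the actual guidance rule may use a different metric or scaling.

```python
import numpy as np

def tangent_project(adv_grad, score, eps=1e-12):
    """Project an adversarial gradient onto the tangent space of the
    iso-(log-)density surface, i.e. strip its component along the score.
    Hypothetical sketch of the projection DPAC describes."""
    s = score / (np.linalg.norm(score) + eps)   # unit normal to the iso-density surface
    normal_component = np.dot(adv_grad, s) * s  # part that changes log-density (drives drift)
    return adv_grad - normal_component          # tangent part, which minimizes path-KL

# Usage: the projected gradient is orthogonal to the score direction.
g = np.array([1.0, 2.0, 3.0])   # toy adversarial gradient
s = np.array([0.0, 0.0, 1.0])   # toy score direction
g_tan = tangent_project(g, s)   # → [1.0, 2.0, 0.0], with g_tan · s = 0
```

The design choice mirrors the paper's first-order optimality condition: only the score-normal component contributes to distributional drift, so discarding it preserves the classification gain carried by the tangent directions while reducing control energy.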


Key Contributions

  • Formalizes quality degradation in adversarially guided diffusion as a path-space KL divergence (path-KL), showing via Girsanov's theorem that it equals the control energy and bounds both FID and the 2-Wasserstein distance
  • Derives a first-order optimality condition showing that projecting adversarial gradients onto the tangent space (iso-density surfaces) of the generative score minimizes path-KL while preserving classification gain
  • Proposes DPAC, a diffusion guidance rule implementing this projection, achieving O(Δt²) quality gap versus O(Δt) for unprojected guidance, validated on ImageNet-100
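The Girsanov identity underlying the first contribution has a simple discrete form: the path-KL between the controlled and nominal diffusions equals the expected control energy, ½∫‖u_t‖²/σ_t² dt. A minimal Monte-Carlo sketch of that discretization follows; the function name, constant-σ assumption, and Euler discretization are all illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def path_kl_estimate(controls, sigma, dt):
    """Estimate the path-space KL between controlled and nominal diffusions
    along one sampled trajectory. By Girsanov's theorem this equals the
    control energy (1/2) * integral of ||u_t||^2 / sigma^2 dt, here
    discretized with an Euler step of size dt (illustrative sketch)."""
    controls = np.asarray(controls)                       # shape (steps, dim)
    energy = 0.5 * np.sum(controls**2, axis=1) / sigma**2 # per-step control energy
    return float(np.sum(energy) * dt)                     # Riemann-sum approximation

# Usage: two steps of control with unit diffusion coefficient.
kl = path_kl_estimate([[1.0, 0.0], [0.0, 2.0]], sigma=1.0, dt=0.1)
# → 0.5 * (1 + 4) * 0.1 = 0.25
```

This is why the tangent projection helps: dropping the score-normal component of the control shrinks ‖u_t‖² at each step, and the identity above converts that directly into a smaller path-KL, hence tighter FID and Wasserstein bounds.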

🛡️ Threat Analysis

Input Manipulation Attack

DPAC is a method for generating adversarial examples via classifier-guided diffusion sampling — the adversarial control steers diffusion trajectories toward a target class (attack success rate is the primary evaluation metric). The paper's core contribution is improving the perceptual fidelity of these adversarial inputs without sacrificing attack success, directly advancing adversarial example generation methodology.


Details

Domains
vision, generative
Model Types
diffusion, cnn
Threat Tags
inference_time, targeted, digital
Datasets
ImageNet-100
Applications
image classification, adversarial example generation