
DPAC: Distribution-Preserving Adversarial Control for Diffusion Sampling

Han-Jin Lee , Han-Ju Lee , Jin-Seong Kim , Seok-Hwan Choi

0 citations · 29 references · arXiv


Published on arXiv · 2512.01153

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

DPAC achieves lower FID and path-KL at matched attack success rates compared to standard adversarially guided diffusion on ImageNet-100, with the tangent projection reducing the Wasserstein quality gap from O(Δt) to O(Δt²)

DPAC (Distribution-Preserving Adversarial Control)

Novel technique introduced


Adversarially guided diffusion sampling often reaches the target class, but sample quality degrades as deviations between the adversarially controlled and nominal trajectories accumulate. We formalize this degradation as a path-space Kullback-Leibler divergence (path-KL) between the controlled and nominal (uncontrolled) diffusion processes, showing via Girsanov's theorem that it exactly equals the control energy. Building on this stochastic optimal control (SOC) view, we establish that minimizing this path-KL simultaneously tightens upper bounds on both the 2-Wasserstein distance and the Fréchet Inception Distance (FID), revealing a principled connection between adversarial control energy and perceptual fidelity. From a variational perspective, we derive a first-order optimality condition for the control: among all directions that yield the same classification gain, the component tangent to iso-(log-)density surfaces (i.e., orthogonal to the score) minimizes path-KL, whereas the normal component directly increases distributional drift. This leads to DPAC (Distribution-Preserving Adversarial Control), a diffusion guidance rule that projects adversarial gradients onto the tangent space defined by the generative score geometry. We further show that in discrete solvers, the tangent projection cancels the O(Δt) leading error term in the Wasserstein distance, achieving an O(Δt²) quality gap; moreover, it remains second-order robust to score or metric approximation. Empirical studies on ImageNet-100 validate the theoretical predictions, confirming that DPAC achieves lower FID and estimated path-KL at matched attack success rates.
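The core geometric operation described above can be sketched in a few lines: remove from the adversarial gradient its component along the score (the normal to the iso-density surface), keeping only the tangent part. This is an illustrative sketch, not the paper's implementation; the function name `tangent_project` and the Euclidean metric are assumptions, and the actual guidance rule may use a different metric or scaling.

```python
import numpy as np

def tangent_project(adv_grad, score, eps=1e-12):
    """Project an adversarial gradient onto the tangent space of the
    iso-(log-)density surface, i.e. strip its component along the score.
    Hypothetical sketch of the projection DPAC describes."""
    s = score / (np.linalg.norm(score) + eps)   # unit normal to the iso-density surface
    normal_component = np.dot(adv_grad, s) * s  # part that changes log-density (drives drift)
    return adv_grad - normal_component          # tangent part, which minimizes path-KL

# Usage: the projected gradient is orthogonal to the score direction.
g = np.array([1.0, 2.0, 3.0])   # toy adversarial gradient
s = np.array([0.0, 0.0, 1.0])   # toy score direction
g_tan = tangent_project(g, s)   # → [1.0, 2.0, 0.0], with g_tan · s = 0
```

The design choice mirrors the paper's first-order optimality condition: only the score-normal component contributes to distributional drift, so discarding it preserves the classification gain carried by the tangent directions while reducing control energy.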


Key Contributions

  • Formalizes quality degradation in adversarially guided diffusion as a path-space KL divergence (path-KL), showing via Girsanov's theorem that it equals the control energy and bounds both FID and the 2-Wasserstein distance
  • Derives a first-order optimality condition showing that projecting adversarial gradients onto the tangent space (iso-density surfaces) of the generative score minimizes path-KL while preserving classification gain
  • Proposes DPAC, a diffusion guidance rule implementing this projection, achieving O(Δt²) quality gap versus O(Δt) for unprojected guidance, validated on ImageNet-100
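The Girsanov identity underlying the first contribution has a simple discrete form: the path-KL between the controlled and nominal diffusions equals the expected control energy, ½∫‖u_t‖²/σ_t² dt. A minimal Monte-Carlo sketch of that discretization follows; the function name, constant-σ assumption, and Euler discretization are all illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def path_kl_estimate(controls, sigma, dt):
    """Estimate the path-space KL between controlled and nominal diffusions
    along one sampled trajectory. By Girsanov's theorem this equals the
    control energy (1/2) * integral of ||u_t||^2 / sigma^2 dt, here
    discretized with an Euler step of size dt (illustrative sketch)."""
    controls = np.asarray(controls)                       # shape (steps, dim)
    energy = 0.5 * np.sum(controls**2, axis=1) / sigma**2 # per-step control energy
    return float(np.sum(energy) * dt)                     # Riemann-sum approximation

# Usage: two steps of control with unit diffusion coefficient.
kl = path_kl_estimate([[1.0, 0.0], [0.0, 2.0]], sigma=1.0, dt=0.1)
# → 0.5 * (1 + 4) * 0.1 = 0.25
```

This is why the tangent projection helps: dropping the score-normal component of the control shrinks ‖u_t‖² at each step, and the identity above converts that directly into a smaller path-KL, hence tighter FID and Wasserstein bounds.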

🛡️ Threat Analysis

Input Manipulation Attack

DPAC is a method for generating adversarial examples via classifier-guided diffusion sampling — the adversarial control steers diffusion trajectories toward a target class (attack success rate is the primary evaluation metric). The paper's core contribution is improving the perceptual fidelity of these adversarial inputs without sacrificing attack success, directly advancing adversarial example generation methodology.


Details

Domains
vision, generative
Model Types
diffusion, cnn
Threat Tags
inference_time, targeted, digital
Datasets
ImageNet-100
Applications
image classification, adversarial example generation