
Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles

Dong Lao 1,2, Yuxiang Zhang 2, Haniyeh Ehsani Oskouie 2, Yangchao Wu 2, Alex Wong 3, Stefano Soatto 2

0 citations · 66 references

Published on arXiv · arXiv:2510.03224

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

A training-free latent ensembling defense recovers up to 71.9% of adversarial accuracy loss on stereo matching and 68.1% on image classification across multiple attack types including adaptive attacks.

Stochastic Resonance of Latent Ensembles (SRLE)

Novel technique introduced


We propose a test-time defense mechanism against adversarial attacks: imperceptible image perturbations that significantly alter the predictions of a model. Unlike existing methods that rely on feature filtering or smoothing, which can lead to information loss, we propose to "combat noise with noise" by leveraging stochastic resonance to enhance robustness while minimizing information loss. Our approach introduces small translational perturbations to the input image, aligns the transformed feature embeddings, and aggregates them before mapping back to the original reference image. This can be expressed in a closed-form formula, which can be deployed on diverse existing network architectures without introducing additional network modules or fine-tuning for specific attack types. The resulting method is entirely training-free, architecture-agnostic, and attack-agnostic. Empirical results show state-of-the-art robustness on image classification and, for the first time, establish a generic test-time defense for dense prediction tasks, including stereo matching and optical flow, highlighting the method's versatility and practicality. Specifically, relative to clean (unperturbed) performance, our method recovers up to 68.1% of the accuracy loss on image classification, 71.9% on stereo matching, and 29.2% on optical flow under various types of adversarial attacks.
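The pipeline described above (perturb the input with small translations, embed each copy, align the embeddings back to the reference frame, and aggregate) can be sketched in a few lines. This is a minimal toy illustration, not the paper's implementation: `embed` stands in for a frozen network's feature extractor, circular `np.roll` shifts stand in for the translational perturbations and their inverse alignment, and the aggregation is a plain average.

```python
import numpy as np

def srle_defend(image, embed, shifts):
    """Toy sketch of Stochastic Resonance of Latent Ensembles (SRLE).

    For each small translation (dy, dx): shift the input, embed it,
    shift the embedding back to the reference frame, then average the
    aligned embeddings. `embed` is a placeholder for a network's
    latent-feature map; real SRLE aggregates inside the network.
    """
    aggregated = None
    for dy, dx in shifts:
        shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))   # perturb input
        feat = embed(shifted)                                    # embed
        aligned = np.roll(feat, shift=(-dy, -dx), axis=(0, 1))  # align back
        aggregated = aligned if aggregated is None else aggregated + aligned
    return aggregated / len(shifts)
```

With a shift-equivariant `embed` (here, the identity) the ensemble average exactly recovers the clean embedding, which is the sense in which the method "minimizes information loss"; for a real network the ensemble instead averages out perturbation-sensitive components of the features.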


Key Contributions

  • Training-free, architecture-agnostic test-time defense via stochastic resonance: small translational perturbations are applied to inputs, embeddings are aligned and aggregated in latent space, expressed in a closed-form formula with no additional modules or fine-tuning
  • First generic test-time adversarial defense demonstrated on dense prediction tasks (stereo matching and optical flow) in addition to image classification
  • Recovers up to 68.1% of accuracy loss on image classification, 71.9% on stereo matching, and 29.2% on optical flow across diverse attack types including adaptive attacks

🛡️ Threat Analysis

Input Manipulation Attack

The paper directly proposes and evaluates a defense against adversarial examples — gradient-based imperceptible input perturbations that cause misclassification or wrong dense predictions at inference time. The threat model is canonical ML01: FGSM, PGD, universal perturbations, and adaptive attacks targeting image classifiers and dense prediction networks.
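To make the threat model concrete, here is a minimal FGSM sketch on a toy logistic-regression model. This is an illustration of the attack family only; the paper evaluates FGSM, PGD, universal, and adaptive attacks against deep classification and dense-prediction networks, not this toy model.

```python
import numpy as np

def fgsm_attack(x, w, b, y, eps):
    """One-step Fast Gradient Sign Method (FGSM) against a toy
    logistic-regression model with weights w, bias b, label y in {0, 1}.

    The loss is binary cross-entropy on sigmoid(w.x + b); the adversarial
    example moves x by eps in the direction of the sign of the loss
    gradient with respect to the input.
    """
    z = float(w @ x + b)
    p = 1.0 / (1.0 + np.exp(-z))   # model's predicted probability
    grad_x = (p - y) * w           # d(BCE)/dx for the logistic model
    return x + eps * np.sign(grad_x)
```

Even this one-step attack can flip a confident prediction when `eps` exceeds the margin, which is why bounded, imperceptible perturbations are the canonical ML01 threat.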


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
white_box, black_box, inference_time, digital, untargeted
Applications
image classification, stereo matching, optical flow