Defense · 2025

Boosting the Robustness-Accuracy Trade-off of SNNs by Robust Temporal Self-Ensemble

Jihang Wang 1,2, Dongcheng Zhao 1,3,4, Ruolin Chen 1,2, Qian Zhang 1,2,3,4, Yi Zeng 1,2,3,4



Published on arXiv: 2508.11279

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

RTE consistently outperforms existing adversarial training methods in the robustness-accuracy trade-off on CIFAR-10 and CIFAR-100, with gains especially pronounced at larger perturbation budgets, where standard adversarial training suffers severe clean-accuracy collapse.

RTE (Robust Temporal self-Ensemble)

Novel technique introduced


Spiking Neural Networks (SNNs) offer a promising direction for energy-efficient and brain-inspired computing, yet their vulnerability to adversarial perturbations remains poorly understood. In this work, we revisit the adversarial robustness of SNNs through the lens of temporal ensembling, treating the network as a collection of evolving sub-networks across discrete timesteps. This formulation uncovers two critical but underexplored challenges: the fragility of individual temporal sub-networks and the tendency for adversarial vulnerabilities to transfer across time. To overcome these limitations, we propose Robust Temporal self-Ensemble (RTE), a training framework that improves the robustness of each sub-network while reducing the temporal transferability of adversarial perturbations. RTE integrates both objectives into a unified loss and employs a stochastic sampling strategy for efficient optimization. Extensive experiments across multiple benchmarks demonstrate that RTE consistently outperforms existing training methods in the robustness-accuracy trade-off. Additional analyses reveal that RTE reshapes the internal robustness landscape of SNNs, leading to more resilient and temporally diversified decision boundaries. Our study highlights the importance of temporal structure in adversarial learning and offers a principled foundation for building robust spiking models.


Key Contributions

  • Formalizes SNN outputs as a temporal self-ensemble of sub-networks per timestep, revealing fragility of individual sub-networks and adversarial vulnerability transfer across timesteps
  • Proposes RTE, a training framework with unified loss that jointly strengthens each temporal sub-network's robustness and reduces inter-timestep adversarial transferability via stochastic sampling
  • Demonstrates consistent improvement in robustness-accuracy trade-off over existing SNN adversarial training baselines on CIFAR-10 and CIFAR-100
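The paper does not reproduce its exact objective here, but the structure it describes — a per-timestep robustness term plus a penalty discouraging adversarial transferability across timesteps, computed on a random subset of timesteps — can be sketched as follows. This is a hypothetical NumPy illustration: `rte_style_loss`, the dot-product agreement penalty, and all parameter values (`k`, `lam`) are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, label):
    # Negative log-likelihood of the true class.
    return -np.log(softmax(logits)[label] + 1e-12)

def rte_style_loss(logits_per_t, label, rng, k=2, lam=0.5):
    """Illustrative RTE-style loss (not the paper's exact objective).

    logits_per_t: list of per-timestep logit vectors (one sub-network each),
    assumed to be computed on an adversarially perturbed input.
    """
    T = len(logits_per_t)
    # Stochastic sampling: optimize only k randomly chosen timesteps per step.
    idx = rng.choice(T, size=min(k, T), replace=False)
    # Robustness term: average loss of the sampled temporal sub-networks.
    robust = np.mean([cross_entropy(logits_per_t[t], label) for t in idx])
    # Transferability proxy: penalize agreement between sampled timesteps
    # (mean pairwise dot product of softmax outputs; illustrative choice).
    probs = [softmax(logits_per_t[t]) for t in idx]
    pair, n = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            pair += float(probs[i] @ probs[j])
            n += 1
    transfer = pair / n if n else 0.0
    return robust + lam * transfer
```

In an actual training loop, the sampled-timestep design keeps the cost of the joint objective roughly constant as the number of timesteps grows.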

🛡️ Threat Analysis

Input Manipulation Attack

Proposes a defense (adversarial training framework RTE) against adversarial input perturbations that cause SNN misclassification at inference time, evaluated against PGD and AutoPGD attacks.
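PGD, the main attack used in the evaluation, is the standard iterated sign-gradient attack projected into an L-infinity ball around the clean input. A minimal NumPy sketch (the `grad_fn` callback and the `eps`, `alpha`, `steps` values are illustrative, not the paper's evaluation settings):

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=0.03, alpha=0.01, steps=10):
    """Minimal L-infinity PGD sketch.

    grad_fn(x_adv) should return the gradient of the attacked model's loss
    with respect to the input; here it is an abstract callback.
    """
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)          # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)    # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)            # keep valid pixel range
    return x_adv
```

AutoPGD, also used in the paper's evaluation, follows the same projection scheme but adapts the step size automatically rather than fixing `alpha`.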


Details

Domains
vision
Model Types
snn
Threat Tags
white_box, inference_time, digital
Datasets
CIFAR-10, CIFAR-100
Applications
image classification