defense 2025

Unsupervised Backdoor Detection and Mitigation for Spiking Neural Networks

Jiachen Li , Bang Wu , Xiaoyu Xia , Xiaoning Liu , Xun Yi , Xiuzhen Zhang

0 citations · 52 references · RAID


Published on arXiv

2510.06629

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

TMPBD achieves 100% backdoor target label detection accuracy across all benchmarks; NDSBM reduces attack success rate from 100% to 2.81% when combined with detection, without degrading clean accuracy.

TMPBD / NDSBM

Novel technique introduced


Spiking Neural Networks (SNNs) have gained increasing attention for their superior energy efficiency compared to Artificial Neural Networks (ANNs). However, their security aspects, particularly under backdoor attacks, have received limited attention. Existing defense methods developed for ANNs perform poorly or can be easily bypassed in SNNs due to their event-driven and temporal dependencies. This paper identifies the key blockers that hinder traditional backdoor defenses in SNNs and proposes an unsupervised post-training detection framework, Temporal Membrane Potential Backdoor Detection (TMPBD), to overcome these challenges. TMPBD leverages the maximum margin statistics of temporal membrane potential (TMP) in the final spiking layer to detect target labels without any attack knowledge or data access. We further introduce a robust mitigation mechanism, Neural Dendrites Suppression Backdoor Mitigation (NDSBM), which clamps dendritic connections between early convolutional layers to suppress malicious neurons while preserving benign behaviors, guided by TMP extracted from a small, clean, unlabeled dataset. Extensive experiments on multiple neuromorphic benchmarks and state-of-the-art input-aware dynamic trigger attacks demonstrate that TMPBD achieves 100% detection accuracy, while NDSBM reduces the attack success rate from 100% to 8.44%, and to 2.81% when combined with detection, without degrading clean accuracy.
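To make the detection idea concrete, here is a minimal sketch of how a per-class maximum margin statistic could be screened for an anomalous target label. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not say how the margins are obtained or which outlier test is used, so the function names, the median-absolute-deviation (MAD) test, and the 3.5 threshold are all hypothetical choices.

```python
from statistics import median


def margin(potentials, c):
    """Margin of class c: its final-layer potential minus the largest rival potential.

    `potentials` is a hypothetical list holding one aggregated temporal
    membrane potential per output class.
    """
    rival = max(p for k, p in enumerate(potentials) if k != c)
    return potentials[c] - rival


def flag_outlier_class(stats, threshold=3.5):
    """Return the class whose statistic is an upper outlier, or None.

    `stats` holds one maximum margin statistic per class. The modified
    z-score based on the MAD is an assumed stand-in for whatever anomaly
    test the paper actually uses.
    """
    med = median(stats)
    mad = median(abs(s - med) for s in stats)
    if mad == 0:
        return None  # no spread among classes, so nothing can be ranked
    scores = [0.6745 * (s - med) / mad for s in stats]
    best = max(range(len(stats)), key=scores.__getitem__)
    return best if scores[best] > threshold else None


# Hypothetical statistics for a 10-class model where class 4 stands out.
suspect = flag_outlier_class([1.0, 1.2, 0.9, 1.1, 9.0, 1.05, 0.95, 1.15, 1.0, 1.1])
```

The intuition is that a backdoored target class tends to admit an unusually large margin compared with the other classes, so a robust outlier test over the per-class statistics can single it out without knowing the trigger.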


Key Contributions

  • TMPBD: unsupervised post-training backdoor detection using maximum margin statistics of temporal membrane potential in the final spiking layer — achieves 100% detection accuracy without attack knowledge or data access
  • NDSBM: mitigation mechanism that clamps dendritic weights between early convolutional layers to suppress malicious neurons, guided by TMP from a small clean unlabeled dataset
  • First comprehensive backdoor defense framework dedicated to SNNs, identifying and addressing fundamental blockers that prevent ANN defenses from transferring to the SNN setting
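The clamping step in the NDSBM contribution can be pictured with a small sketch. This assumes the simplest possible form of clamping, an elementwise clip of connection weights to a symmetric bound; the paper's actual rule for choosing the bound from TMP on clean unlabeled data is not given in this summary, so `bound` is a hypothetical parameter supplied by the caller.

```python
def clamp_weights(weights, bound):
    """Clip every connection weight into [-bound, bound].

    `weights` is a list of rows, one row per downstream neuron. How `bound`
    is derived is an assumption left to the caller; it is not the paper's
    TMP-guided rule.
    """
    if bound < 0:
        raise ValueError("bound must be non-negative")
    return [[max(-bound, min(bound, w)) for w in row] for row in weights]


# Hypothetical 2x3 weight matrix with one abnormally large connection.
clamped = clamp_weights([[0.1, -0.2, 4.0], [0.3, 0.05, -0.1]], bound=0.5)
```

Weights already inside the bound pass through unchanged, which is why a well-chosen bound can suppress a few oversized connections while leaving benign behavior largely intact.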

🛡️ Threat Analysis

Model Poisoning

The paper proposes two dedicated defense mechanisms, TMPBD and NDSBM, to detect and mitigate backdoor (trojan) attacks in SNNs. TMPBD detects the target labels of hidden backdoor triggers, while NDSBM suppresses malicious neurons to neutralize embedded backdoor behavior, reducing the attack success rate (ASR) from 100% to 8.44% on its own and to 2.81% when combined with detection.
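For readers unfamiliar with the metric, attack success rate is commonly computed as the percentage of triggered inputs that the model assigns to the attacker's target label. The sketch below follows that common convention and excludes samples whose true label already equals the target; whether the paper uses exactly this convention is an assumption, and the function name is hypothetical.

```python
def attack_success_rate(predictions, true_labels, target_label):
    """Percentage of triggered samples classified as the target label.

    Samples whose true label is already the target are skipped, since a
    "hit" on them says nothing about the backdoor (assumed convention).
    """
    pairs = [(p, t) for p, t in zip(predictions, true_labels) if t != target_label]
    if not pairs:
        raise ValueError("no samples with a non-target true label")
    hits = sum(1 for p, _ in pairs if p == target_label)
    return 100.0 * hits / len(pairs)


# Hypothetical predictions on five triggered inputs with target label 4.
asr = attack_success_rate([4, 4, 1, 4, 4], [0, 1, 1, 2, 4], target_label=4)
```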


Details

Domains
vision
Model Types
cnn
Threat Tags
training_time, targeted
Datasets
N-MNIST, N-CALTECH101, CIFAR10-DVS
Applications
neuromorphic event-based classification, image classification, autonomous driving