TED++: Submanifold-Aware Backdoor Detection via Layerwise Tubular-Neighbourhood Screening

As deep neural networks power increasingly critical applications, stealthy backdoor attacks, where poisoned training inputs trigger malicious model behaviour while appearing benign, pose a severe security risk. Many existing defences are vulnerable when attackers exploit subtle distance-based anomalies or when clean examples are scarce. To meet this challenge, we introduce TED++, a submanifold-aware framework that effectively detects subtle backdoors that evade existing defences. TED++ begins by constructing a tubular neighbourhood around each class's hidden-feature manifold, estimating its local "thickness" from a handful of clean activations. It then applies Locally Adaptive Ranking (LAR) to detect any activation that drifts outside the admissible tube. By aggregating these LAR-adjusted ranks across all layers, TED++ captures how faithfully an input remains on the evolving class submanifolds. Based on such characteristic "tube-constrained" behaviour, TED++ flags inputs whose LAR-based ranking sequences deviate significantly. Extensive experiments are conducted on benchmark datasets and tasks, demonstrating that TED++ achieves state-of-the-art detection performance under both adaptive-attack and limited-data scenarios. Remarkably, even with only five held-out examples per class, TED++ still delivers near-perfect detection, achieving gains of up to 14% in AUROC over the next-best method. The code is publicly available at https://github.com/namle-w/TEDpp.
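The abstract does not spell out how the tube thickness or the LAR rank is computed, so the following is only an illustrative single-layer sketch under stated assumptions, not the authors' implementation. It assumes the thickness is a quantile of leave-one-out nearest-neighbour distances among a class's clean activations, and that an activation falling outside the tube is simply given the worst possible rank. The names `tube_radius`, `lar_rank` and the `quantile` parameter are hypothetical.

```python
import numpy as np

def tube_radius(class_acts, quantile=0.95):
    """Assumed thickness estimate: a quantile of the leave-one-out
    nearest-neighbour distances among one class's clean activations."""
    diffs = class_acts[:, None, :] - class_acts[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # ignore each point's distance to itself
    return float(np.quantile(dists.min(axis=1), quantile))

def lar_rank(x, clean_acts, clean_labels, pred_label, radius):
    """Assumed tube-adjusted rank of activation x at one layer.

    Inside the tube: 1 + number of clean activations (any class) that are
    strictly closer to x than the nearest clean activation of the
    predicted class. Outside the tube: the worst rank, len(clean_acts).
    """
    dists = np.linalg.norm(clean_acts - x, axis=1)
    nearest_same = dists[clean_labels == pred_label].min()
    if nearest_same > radius:
        return len(clean_acts)
    return int(np.sum(dists < nearest_same)) + 1

# Toy layer: five clean activations per class, two well-separated classes.
class0 = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05]])
class1 = class0 + 5.0
acts = np.vstack([class0, class1])
labels = np.array([0] * 5 + [1] * 5)
radius0 = tube_radius(class0)

# An input predicted as class 0 that sits on the class-0 cluster ...
print(lar_rank(np.array([0.02, 0.02]), acts, labels, 0, radius0))
# ... versus one predicted as class 0 whose activation lies near class 1.
print(lar_rank(np.array([4.9, 4.9]), acts, labels, 0, radius0))
```

In this toy setting the first input keeps a low rank, while the second leaves the class-0 tube and is pushed to the worst rank, which is the kind of drift the abstract describes.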

Key Contributions

Tubular-neighbourhood modelling that estimates the 'thickness' of each class's hidden-feature manifold from as few as five clean samples per class
Locally Adaptive Ranking (LAR) that flags activations drifting outside the admissible manifold tube, aggregated across all layers into a trajectory-based detection score
State-of-the-art backdoor detection under both adaptive-attack and limited-data scenarios, achieving up to 14% AUROC gain over the next-best baseline
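To make the trajectory-based score and the AUROC metric in the list above concrete, here is a minimal sketch. The aggregation rule is an assumption, since the summary does not give one: each input's per-layer rank sequence is compared with clean calibration sequences through a mean absolute z-score, with the per-layer spread floored at one rank. AUROC is computed from its pairwise definition. The names `trajectory_scores` and `auroc` are hypothetical, and the rank values are made-up toy data, not results from the paper.

```python
import numpy as np

def trajectory_scores(rank_seqs, clean_rank_seqs):
    """Assumed aggregation: mean absolute z-score of each rank sequence
    (one rank per layer) relative to clean calibration sequences."""
    clean = np.asarray(clean_rank_seqs, dtype=float)
    mu = clean.mean(axis=0)
    # Floor the spread at one rank so constant layers do not divide by zero.
    sigma = np.maximum(clean.std(axis=0), 1.0)
    z = (np.asarray(rank_seqs, dtype=float) - mu) / sigma
    return np.abs(z).mean(axis=1)

def auroc(scores_neg, scores_pos):
    """Probability that a random positive outscores a random negative,
    counting ties as one half (pairwise definition of AUROC)."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())

# Toy rank sequences over four layers (illustrative values only).
calibration = [[1, 1, 1, 1], [1, 2, 1, 1], [1, 1, 2, 1], [2, 1, 1, 1]]
clean_test = [[1, 1, 1, 1], [1, 1, 2, 1]]
poisoned_test = [[1, 3, 10, 10], [2, 10, 10, 1]]

clean_scores = trajectory_scores(clean_test, calibration)
poisoned_scores = trajectory_scores(poisoned_test, calibration)
print(clean_scores, poisoned_scores)
print(auroc(clean_scores, poisoned_scores))
```

A detector built this way would flag inputs whose score exceeds a threshold chosen on clean data, for example a high quantile of the clean scores. Other outlier detectors over the rank sequences would fit the same interface.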

🛡️ Threat Analysis

Model Poisoning

TED++ is a defence against backdoor (trojan) attacks: it detects poisoned inputs carrying hidden trigger patterns by checking whether their layerwise activations remain faithful to the class submanifold, catching trigger-induced activation drift that standard defences miss.

Details

Domains

vision

Model Types

CNN, transformer

Threat Tags

training_time, inference_time, targeted, digital

Datasets

CIFAR-10, CIFAR-100, ImageNet, GTSRB

Applications

2026 0 cit.

Model Poisoning

80%

Similar Papers

TrojanDec: Data-free Detection of Trojan Inputs in Self-supervised Learning

Kill it with FIRE: On Leveraging Latent Space Directions for Runtime Backdoor Mitigation in Deep Neural Networks

Robust Backdoor Removal by Reconstructing Trigger-Activated Changes in Latent Representation

Isolate Trigger: Detecting and Eliminating Adaptive Backdoor Attacks

NT-ML: Backdoor Defense via Non-target Label Training and Mutual Learning

Backdoor Mitigation via Invertible Pruning Masks

Illuminating the Black Box: Real-Time Monitoring of Backdoor Unlearning in CNNs via Explainable AI

BackdoorIDS: Zero-shot Backdoor Detection for Pretrained Vision Encoder