Enhancing Adversarial Robustness of IoT Intrusion Detection via SHAP-Based Attribution Fingerprinting
Dilli Prasad Sharma 1, Liang Xue 1, Xiaowei Sun 1, Xiaodong Lin 2, Pulei Xiong 3
Published on arXiv (arXiv:2511.06197)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
The SHAP attribution fingerprinting autoencoder significantly outperforms a state-of-the-art baseline in detecting adversarial attacks on an IoT IDS benchmark, with no requirement for labeled adversarial training data.
SHAP Attribution Fingerprinting
Novel technique introduced
The rapid proliferation of Internet of Things (IoT) devices has transformed numerous industries by enabling seamless connectivity and data-driven automation. However, this expansion has also exposed IoT networks to increasingly sophisticated security threats, including adversarial attacks that target artificial intelligence (AI) and machine learning (ML)-based intrusion detection systems (IDS) to evade detection, induce misclassification, and undermine the reliability and integrity of security defenses. To address these challenges, we propose a novel adversarial detection model that enhances the robustness of IoT IDS against adversarial attacks through SHapley Additive exPlanations (SHAP)-based fingerprinting. Using SHAP's DeepExplainer, we extract attribution fingerprints from network traffic features, enabling the IDS to reliably distinguish between clean and adversarially perturbed inputs. By capturing subtle attribution patterns, the model becomes more resilient to evasion attempts and adversarial manipulations. We evaluated the model on a standard IoT benchmark dataset, where it significantly outperformed a state-of-the-art method in detecting adversarial attacks. Beyond enhanced robustness, this approach improves model transparency and interpretability, thereby increasing trust in the IDS through explainable AI.
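The core idea of attribution fingerprinting can be illustrated with a minimal sketch. The paper applies SHAP's DeepExplainer to a deep IDS model; here, as a stand-in assumption, we use gradient-times-input attribution on a hypothetical linear scorer (for a linear model with a zero baseline, gradient-times-input coincides with the exact Shapley values), and record the resulting feature-importance rank order as the fingerprint:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear IDS scorer over 10 traffic features (assumption:
# the real model is a deep network explained with SHAP DeepExplainer).
W = rng.normal(size=(10,))

def attribution_fingerprint(x, w=W):
    """Per-feature attribution plus its importance rank order."""
    attr = w * x                        # gradient * input attribution
    ranks = np.argsort(-np.abs(attr))   # most -> least important feature
    return attr, ranks

x_clean = rng.normal(size=(10,))
x_adv = x_clean.copy()
x_adv[0] += 5.0                         # crude adversarial perturbation

attr_c, r_clean = attribution_fingerprint(x_clean)
attr_a, r_adv = attribution_fingerprint(x_adv)
print("rank positions shifted:", int((r_clean != r_adv).sum()))
```

Even a perturbation that barely changes the classifier's output tends to reshuffle which features the attribution ranks as most important, and this rank shift is what the fingerprint captures.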
Key Contributions
- Novel unsupervised adversarial detection method using SHAP DeepExplainer attribution fingerprints and an autoencoder trained exclusively on clean samples to flag adversarial inputs via high reconstruction error — no labeled attack data required.
- Attribution fingerprinting pipeline that captures attack-specific feature importance rank shifts, revealing distinct behavioral patterns induced by different adversarial attacks.
- Demonstrated significant improvement over a state-of-the-art baseline in adversarial detection on an IoT benchmark dataset while also improving model interpretability.
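The unsupervised detection step in the first contribution can be sketched as follows. The paper trains a deep autoencoder on clean attribution fingerprints only; as a minimal, closed-form stand-in we use a linear autoencoder (PCA) fit to hypothetical clean fingerprints, with the detection threshold set from the clean reconstruction-error distribution:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical clean attribution fingerprints: 200 samples, 10 features.
clean = rng.normal(size=(200, 10))

# Linear autoencoder via PCA: keep the top-k principal components;
# reconstruction error serves as the anomaly score. (Assumption: the
# paper uses a deep autoencoder; PCA is its linear special case.)
k = 4
mu = clean.mean(axis=0)
_, _, Vt = np.linalg.svd(clean - mu, full_matrices=False)
P = Vt[:k]                              # shared encoder/decoder weights

def recon_error(X):
    Z = (X - mu) @ P.T                  # encode
    Xh = Z @ P + mu                     # decode
    return np.linalg.norm(X - Xh, axis=1)

# Threshold from clean data only, e.g. the 95th-percentile error.
tau = np.percentile(recon_error(clean), 95)

x_adv = rng.normal(size=(1, 10)) * 6.0  # grossly shifted fingerprint
print("flagged as adversarial:", bool(recon_error(x_adv)[0] > tau))
```

Because both the autoencoder and the threshold are derived from clean samples alone, the detector needs no labeled adversarial examples, matching the paper's unsupervised setting.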
🛡️ Threat Analysis
The paper defends against adversarial input manipulation at inference time, in which attackers craft perturbed inputs to evade the ML-based IDS. The SHAP-based fingerprinting defense detects adversarially perturbed inputs by identifying shifts in feature attribution patterns, making it a direct countermeasure to OWASP ML01 (Input Manipulation Attack).