Zero-Trust Agentic Federated Learning for Secure IIoT Defense Systems
Samaresh Kumar Singh, Joyjit Roy, Martin So
Published on arXiv (arXiv:2512.23809)
Data Poisoning Attack
OWASP ML Top 10 — ML02
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Achieves 93.2% detection accuracy under 30% Byzantine attacks, outperforming FLAME by 3.1% (p<0.01), with 89.3% adversarial robustness and 34% lower communication overhead.
ZTA-FL (Zero-Trust Agentic Federated Learning)
Novel technique introduced
Recent attacks on critical infrastructure, including the 2021 Oldsmar water-treatment breach and the 2023 Danish energy-sector compromises, highlight urgent security gaps in Industrial IoT (IIoT) deployments. While Federated Learning (FL) enables privacy-preserving collaborative intrusion detection, existing frameworks remain vulnerable to Byzantine poisoning attacks and lack robust agent authentication. We propose Zero-Trust Agentic Federated Learning (ZTA-FL), a defense-in-depth framework combining: (1) TPM-based cryptographic attestation achieving a false acceptance rate below 0.0000001, (2) a novel SHAP-weighted aggregation algorithm providing explainable Byzantine detection with theoretical guarantees under non-IID conditions, and (3) privacy-preserving on-device adversarial training. Comprehensive experiments across three IDS benchmarks (Edge-IIoTset, CIC-IDS2017, UNSW-NB15) demonstrate that ZTA-FL achieves 97.8% detection accuracy, 93.2% accuracy under 30% Byzantine attacks (outperforming FLAME by 3.1%, p<0.01), and 89.3% adversarial robustness while reducing communication overhead by 34%. We provide theoretical analysis, failure-mode characterization, and release code for reproducibility.
Key Contributions
- SHAP-weighted Byzantine-fault-tolerant FL aggregation algorithm with theoretical guarantees under non-IID data conditions
- TPM-based cryptographic attestation for FL agent authentication achieving <0.0000001 false acceptance rate
- Privacy-preserving on-device adversarial training achieving 89.3% adversarial robustness while reducing communication overhead by 34%
🛡️ Threat Analysis
The paper explicitly lists privacy-preserving on-device adversarial training as one of its three primary contributions, reporting 89.3% adversarial robustness: a direct defense against inference-time input-manipulation attacks on the IDS model.
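The paper does not spell out its adversarial training recipe here, but the standard form of this defense mixes clean inputs with gradient-based perturbations during local training. The sketch below is a hypothetical minimal example using FGSM-style perturbations on a logistic-regression detector; the function name, model, and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def adversarial_train_step(w, X, y, eps=0.1, lr=0.5):
    """One on-device adversarial training step (hypothetical sketch).

    Crafts FGSM-style perturbations of the inputs, then updates the
    logistic-regression weights w on the combined clean + adversarial batch.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Gradient of the per-sample log-loss w.r.t. each input row is (p - y) * w.
    p = sigmoid(X @ w)
    grad_X = np.outer(p - y, w)
    X_adv = X + eps * np.sign(grad_X)          # FGSM: step in the gradient's sign

    # Weight gradient on the clean + adversarial batch.
    Xb = np.vstack([X, X_adv])
    yb = np.concatenate([y, y])
    pb = sigmoid(Xb @ w)
    grad_w = Xb.T @ (pb - yb) / len(yb)
    return w - lr * grad_w
```

Because perturbations are crafted and consumed locally, only the resulting weight update leaves the device, which is consistent with the privacy-preserving framing in the paper.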
The core novel contribution is a SHAP-weighted Byzantine-fault-tolerant aggregation algorithm that detects and mitigates malicious FL clients sending poisoned model updates, providing a direct Byzantine-poisoning defense with theoretical guarantees under non-IID conditions.
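The general shape of such a defense is to score each client's update and shrink the influence of anomalous ones before averaging. The paper derives its weights from SHAP explanations; since that algorithm is not reproduced here, the sketch below substitutes a simple distance-to-median score as a stand-in. The function name and the exponential weighting scheme are assumptions for illustration, not the paper's method.

```python
import numpy as np

def robust_weighted_aggregate(updates, tau=1.0):
    """Byzantine-robust aggregation sketch (stand-in for SHAP weighting).

    Scores each client update by its distance to the coordinate-wise
    median update and downweights outliers exponentially, so poisoned
    updates contribute almost nothing to the aggregate.
    """
    U = np.stack(updates)                    # shape: (n_clients, n_params)
    med = np.median(U, axis=0)               # robust central update
    dists = np.linalg.norm(U - med, axis=1)  # per-client deviation
    # Weight decays with deviation, scaled by the typical (median) deviation.
    w = np.exp(-dists / (tau * (np.median(dists) + 1e-12)))
    w /= w.sum()
    return w @ U                             # weighted-average update
```

A weighting scheme like this degrades gracefully: honest clients with mildly non-IID data keep nonzero weight, while updates far from the consensus are effectively excluded, which matches the non-IID robustness the paper emphasizes.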