Zero-Trust Agentic Federated Learning for Secure IIoT Defense Systems
Samaresh Kumar Singh, Joyjit Roy, Martin So
Published on arXiv (arXiv:2512.23809)
Data Poisoning Attack
OWASP ML Top 10 — ML02
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Achieves 93.2% detection accuracy under 30% Byzantine attacks, outperforming FLAME by 3.1% (p<0.01), with 89.3% adversarial robustness and 34% lower communication overhead.
ZTA-FL (Zero-Trust Agentic Federated Learning)
Novel technique introduced
Recent attacks on critical infrastructure, including the 2021 Oldsmar water-treatment breach and the 2023 Danish energy-sector compromises, highlight urgent security gaps in Industrial IoT (IIoT) deployments. While Federated Learning (FL) enables privacy-preserving collaborative intrusion detection, existing frameworks remain vulnerable to Byzantine poisoning attacks and lack robust agent authentication. We propose Zero-Trust Agentic Federated Learning (ZTA-FL), a defense-in-depth framework combining: (1) TPM-based cryptographic attestation achieving a false acceptance rate below 0.0000001, (2) a novel SHAP-weighted aggregation algorithm providing explainable Byzantine detection with theoretical guarantees under non-IID conditions, and (3) privacy-preserving on-device adversarial training. Comprehensive experiments across three IDS benchmarks (Edge-IIoTset, CIC-IDS2017, UNSW-NB15) demonstrate that ZTA-FL achieves 97.8% detection accuracy, 93.2% accuracy under 30% Byzantine attacks (outperforming FLAME by 3.1%, p<0.01), and 89.3% adversarial robustness while reducing communication overhead by 34%. We provide theoretical analysis, failure-mode characterization, and release code for reproducibility.
Key Contributions
- SHAP-weighted Byzantine-fault-tolerant FL aggregation algorithm with theoretical guarantees under non-IID data conditions
- TPM-based cryptographic attestation for FL agent authentication achieving <0.0000001 false acceptance rate
- Privacy-preserving on-device adversarial training achieving 89.3% adversarial robustness while reducing communication overhead by 34%
🛡️ Threat Analysis
The paper explicitly lists privacy-preserving on-device adversarial training as one of its three primary contributions, reporting 89.3% adversarial robustness: a direct defense against inference-time input-manipulation attacks on the IDS model.
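The paper does not spell out its adversarial training recipe here, but the standard form of this defense mixes clean inputs with gradient-based perturbations during local training. The sketch below is a hypothetical minimal example using FGSM-style perturbations on a logistic-regression detector; the function name, model, and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def adversarial_train_step(w, X, y, eps=0.1, lr=0.5):
    """One on-device adversarial training step (hypothetical sketch).

    Crafts FGSM-style perturbations of the inputs, then updates the
    logistic-regression weights w on the combined clean + adversarial batch.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Gradient of the per-sample log-loss w.r.t. each input row is (p - y) * w.
    p = sigmoid(X @ w)
    grad_X = np.outer(p - y, w)
    X_adv = X + eps * np.sign(grad_X)          # FGSM: step in the gradient's sign

    # Weight gradient on the clean + adversarial batch.
    Xb = np.vstack([X, X_adv])
    yb = np.concatenate([y, y])
    pb = sigmoid(Xb @ w)
    grad_w = Xb.T @ (pb - yb) / len(yb)
    return w - lr * grad_w
```

Because perturbations are crafted and consumed locally, only the resulting weight update leaves the device, which is consistent with the privacy-preserving framing in the paper.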
The core novel contribution is a SHAP-weighted Byzantine-fault-tolerant aggregation algorithm that detects and mitigates malicious FL clients sending poisoned model updates, providing a direct Byzantine-poisoning defense with theoretical guarantees under non-IID conditions.
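The general shape of such a defense is to score each client's update and shrink the influence of anomalous ones before averaging. The paper derives its weights from SHAP explanations; since that algorithm is not reproduced here, the sketch below substitutes a simple distance-to-median score as a stand-in. The function name and the exponential weighting scheme are assumptions for illustration, not the paper's method.

```python
import numpy as np

def robust_weighted_aggregate(updates, tau=1.0):
    """Byzantine-robust aggregation sketch (stand-in for SHAP weighting).

    Scores each client update by its distance to the coordinate-wise
    median update and downweights outliers exponentially, so poisoned
    updates contribute almost nothing to the aggregate.
    """
    U = np.stack(updates)                    # shape: (n_clients, n_params)
    med = np.median(U, axis=0)               # robust central update
    dists = np.linalg.norm(U - med, axis=1)  # per-client deviation
    # Weight decays with deviation, scaled by the typical (median) deviation.
    w = np.exp(-dists / (tau * (np.median(dists) + 1e-12)))
    w /= w.sum()
    return w @ U                             # weighted-average update
```

A weighting scheme like this degrades gracefully: honest clients with mildly non-IID data keep nonzero weight, while updates far from the consensus are effectively excluded, which matches the non-IID robustness the paper emphasizes.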