Noise-Aware Misclassification Attack Detection in Collaborative DNN Inference
Shima Yousefi, Saptarshi Debroy
Published on arXiv
2603.17914
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Achieves up to 90% AUROC for detecting misclassification attacks in collaborative DNN inference under realistic noisy conditions
Noise-Aware VAE Detection Framework
Novel technique introduced
Collaborative inference of object classification Deep Neural Networks (DNNs), where resource-constrained end-devices offload partially processed data to remote edge servers to complete end-to-end processing, is becoming a key enabler of edge-AI. However, such edge-offloading is vulnerable to malicious data injections that lead to stealthy misclassifications which are difficult to detect, especially in the presence of environmental noise. In this paper, we propose a semi-gray-box, noise-aware anomaly detection framework built on a variational autoencoder (VAE) to capture deviations caused by adversarial manipulation. The proposed framework incorporates a robust noise-aware feature that captures the characteristic behavior of environmental noise, improving detection accuracy while reducing false alarm rates. Our evaluation with popular object classification DNNs demonstrates the robustness of the proposed detection (up to 90% AUROC across DNN configurations) under realistic noisy conditions, while also revealing limitations caused by feature similarity and elevated noise levels.
Key Contributions
- Semi-gray-box VAE-based anomaly detection framework for collaborative DNN inference
- Noise-aware feature extraction that distinguishes environmental noise from adversarial perturbations
- Evaluation demonstrating up to 90% AUROC across DNN configurations under realistic noisy conditions
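The detection pipeline implied by these contributions is threshold-based anomaly scoring evaluated with AUROC. The sketch below is illustrative only: the paper's actual anomaly scores come from a VAE trained on clean, noisy intermediate features, so we substitute synthetic per-sample reconstruction errors; the distributions, the 5% false-alarm target, and the `calibrate_threshold` helper are assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def auroc(scores_clean, scores_attacked):
    """AUROC via the rank-sum (Mann-Whitney U) statistic.
    Higher scores are assumed to mean 'more anomalous'."""
    s = np.concatenate([scores_clean, scores_attacked])
    ranks = s.argsort().argsort() + 1.0            # 1-based ranks (ties ignored)
    n0, n1 = len(scores_clean), len(scores_attacked)
    u = ranks[n0:].sum() - n1 * (n1 + 1) / 2.0
    return u / (n0 * n1)

def calibrate_threshold(clean_scores, target_fpr=0.05):
    """Choose a threshold so roughly `target_fpr` of clean samples are flagged."""
    return np.quantile(clean_scores, 1.0 - target_fpr)

# Synthetic stand-ins for per-sample VAE reconstruction errors (hypothetical):
# benign inputs under environmental noise vs. adversarially perturbed inputs.
clean = rng.lognormal(mean=0.0, sigma=0.5, size=2000)
attacked = rng.lognormal(mean=1.0, sigma=0.5, size=2000)

tau = calibrate_threshold(clean, target_fpr=0.05)
detection_rate = (attacked > tau).mean()
print(f"AUROC = {auroc(clean, attacked):.3f}, "
      f"detection rate @ 5% FPR = {detection_rate:.3f}")
```

Thresholding on a clean calibration set is what trades detection accuracy against false alarms; the noise-aware feature in the paper is aimed at keeping the clean-score distribution tight so this trade-off stays favorable under environmental noise.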
🛡️ Threat Analysis
The paper addresses adversarial manipulation of intermediate features in collaborative DNN inference to cause misclassification at inference time. The threat model involves attackers perturbing data transmitted between end-devices and edge servers to mislead final predictions. The proposed VAE-based detection framework defends against these inference-time evasion attacks.
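The split-computation setting this threat model targets can be sketched as follows. Everything here is a toy stand-in: the two random linear layers, the additive sign-noise adversary, and the layer split point are hypothetical choices for illustration, not the paper's models or attack.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical split of a tiny classifier: the end-device runs the first
# layer, the edge server runs the rest. Weights are random stand-ins.
W1 = rng.normal(size=(32, 16))    # device-side layer
W2 = rng.normal(size=(16, 10))    # server-side layer (10 classes)

def device_forward(x):
    # Intermediate features that get transmitted to the edge server.
    return np.maximum(x @ W1, 0.0)

def server_forward(h):
    # Server completes inference and returns the predicted class.
    return (h @ W2).argmax(axis=-1)

def adversary(h, eps=0.5):
    """Illustrative in-transit manipulation: bounded additive
    perturbation of the offloaded intermediate features."""
    return h + eps * np.sign(rng.normal(size=h.shape))

x = rng.normal(size=(1, 32))
h = device_forward(x)
clean_pred = server_forward(h)
tampered_pred = server_forward(adversary(h, eps=5.0))
```

Because the server only ever sees the (possibly tampered) intermediate tensor `h`, a defense deployed there must distinguish adversarial perturbations from benign environmental noise on that tensor, which is exactly where the paper's VAE-based detector sits.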