
AutoDetect: Designing an Autoencoder-based Detection Method for Poisoning Attacks on Object Detection Applications in the Military Domain

Alma M. Liezenga , Stefan Wijnja , Puck de Haan , Niels W. T. Brink , Jip J. van Stijn , Yori Kamphuis , Klamer Schutte

Published on arXiv: 2509.03179

Model Poisoning (OWASP ML Top 10 — ML10)

Data Poisoning Attack (OWASP ML Top 10 — ML02)

Key Finding

AutoDetect outperforms existing poisoning and anomaly detection methods in separating clean from backdoored samples while being less time- and memory-intensive, though attacks require poisoning a large fraction of training data to succeed.

AutoDetect

Novel technique introduced


Poisoning attacks pose an increasing threat to the security and robustness of Artificial Intelligence systems in the military domain. The widespread use of open-source datasets and pretrained models exacerbates this risk. Despite the severity of this threat, there is limited research on the application and detection of poisoning attacks on object detection systems. This is especially problematic in the military domain, where attacks can have grave consequences. In this work, we investigate both the effect of poisoning attacks on military object detectors in practice and the best approach to detect these attacks. To support this research, we create a small, custom dataset featuring military vehicles: MilCivVeh. We explore the vulnerability of military object detectors to poisoning attacks by implementing a modified version of the BadDet attack, a patch-based poisoning attack. We then assess its impact, finding that while a positive attack success rate is achievable, it requires a substantial portion of the data to be poisoned, raising questions about its practical applicability. To address the detection challenge, we test both specialized poisoning detection methods and anomaly detection methods from the visual industrial inspection domain. Since our research shows that both classes of methods are lacking, we introduce our own patch detection method: AutoDetect, a simple, fast, and lightweight autoencoder-based method. Our method shows promising results in separating clean from poisoned samples using the reconstruction error of image slices, outperforming existing methods while being less time- and memory-intensive. We urge that the availability of large, representative datasets in the military domain is a prerequisite to further evaluating the risks of poisoning attacks and the opportunities for patch detection.


Key Contributions

  • AutoDetect: a lightweight autoencoder-based method that separates clean from poisoned training samples using reconstruction error of image slices, outperforming existing detection baselines
  • Empirical evaluation of BadDet (patch-based backdoor) on a custom military vehicle dataset (MilCivVeh), finding high poisoning rates are required for practical attack success
  • Comparative benchmark of specialized poisoning detection methods and industrial anomaly detection methods on adversarial patch detection, exposing gaps in both classes

🛡️ Threat Analysis

Data Poisoning Attack

The paper explicitly frames the threat as data poisoning (poisoning a fraction of training data) and tests specialized poisoning detection methods; the attack mechanism is training-data corruption, which co-occurs with the backdoor insertion.

Model Poisoning

BadDet is a patch-based backdoor attack embedded via poisoned training data; AutoDetect defends by detecting the trigger patches. The core threat model is trigger-activated hidden behavior — canonical ML10.
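A patch-based poisoning of this kind can be sketched as follows. This is a deliberately simplified illustration, not the paper's attack: labels are reduced to per-image class IDs (BadDet manipulates object-detection bounding-box annotations), and the function name, random patch placement, and poison rate are all assumptions for the sketch.

```python
import numpy as np


def poison_dataset(images, labels, trigger, target_label,
                   poison_rate=0.3, seed=0):
    """Stamp a trigger patch into a random fraction of images and rewrite
    their labels to the attacker's target class.

    Simplified BadDet-style sketch: real BadDet edits detection annotations
    (boxes/classes), and placement strategy matters; both are idealized here.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n = len(images)
    idx = rng.choice(n, size=int(poison_rate * n), replace=False)
    ph, pw = trigger.shape[:2]
    for i in idx:
        h, w = images[i].shape[:2]
        y = rng.integers(0, h - ph + 1)  # random patch location
        x = rng.integers(0, w - pw + 1)
        images[i][y:y + ph, x:x + pw] = trigger  # insert trigger patch
        labels[i] = target_label  # backdoor: trigger maps to target class
    return images, labels, idx
```

The `poison_rate` parameter makes the paper's key finding concrete: the attack only succeeds when this fraction is large, which limits its stealth in practice.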


Details

Domains
vision
Model Types
cnn
Threat Tags
training_time, targeted, digital
Datasets
MilCivVeh
Applications
military object detection, object detection