Autoencoder-based Denoising Defense against Adversarial Attacks on Object Detection
Min Geun Song, Gang Min Kim, Woonmin Kim, Yongsik Kim, Jeonghyun Sim, Sangbeom Park, Huy Kang Kim
Published on arXiv (arXiv:2512.16123)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Autoencoder denoising recovers bbox mAP@50 by 10.8% (0.2780→0.3080) after Perlin noise adversarial attacks on YOLOv5, with no model retraining required.
Autoencoder-based Denoising Defense
Novel technique introduced
Deep learning-based object detection models play a critical role in real-world applications such as autonomous driving and security surveillance systems, yet they remain vulnerable to adversarial examples. In this work, we propose an autoencoder-based denoising defense to recover object detection performance degraded by adversarial perturbations. We conduct adversarial attacks using Perlin noise on vehicle-related images from the COCO dataset, apply a single-layer convolutional autoencoder to remove the perturbations, and evaluate detection performance using YOLOv5. Our experiments demonstrate that adversarial attacks reduce bbox mAP from 0.2890 to 0.1640, representing a 43.3% performance degradation. After applying the proposed autoencoder defense, bbox mAP improves to 0.1700 (3.7% recovery) and bbox mAP@50 increases from 0.2780 to 0.3080 (10.8% improvement). These results indicate that autoencoder-based denoising can provide partial defense against adversarial attacks without requiring model retraining.
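The abstract's attack step perturbs images with Perlin noise, a smooth gradient noise that is harder to filter out than i.i.d. pixel noise. The paper summary does not specify the grid resolution, amplitude, or seed used, so the sketch below is a generic 2D Perlin generator plus an assumed additive perturbation (ε = 0.1 here is illustrative, not the paper's setting):

```python
import numpy as np

def perlin_2d(shape, res, seed=0):
    """2D Perlin (gradient) noise in roughly [-1, 1].
    shape: output (H, W); res: grid cells (rh, rw); res must divide shape."""
    rng = np.random.default_rng(seed)
    (h, w), (rh, rw) = shape, res
    dh, dw = h // rh, w // rw
    # random unit gradient vectors at the grid corners
    angles = rng.uniform(0, 2 * np.pi, (rh + 1, rw + 1))
    grads = np.stack([np.cos(angles), np.sin(angles)], axis=-1)
    # per-pixel cell index and fractional offset within the cell
    ys, xs = np.arange(h) / dh, np.arange(w) / dw
    yi, xi = ys.astype(int), xs.astype(int)
    yf, xf = ys - yi, xs - xi
    fade = lambda t: t * t * t * (t * (t * 6 - 15) + 10)  # 6t^5-15t^4+10t^3
    u, v = fade(xf)[None, :], fade(yf)[:, None]

    def dot_grad(oy, ox):
        # dot product of corner gradient with offset vector to that corner
        g = grads[yi[:, None] + oy, xi[None, :] + ox]       # (h, w, 2)
        return g[..., 0] * (xf[None, :] - ox) + g[..., 1] * (yf[:, None] - oy)

    # bilinear interpolation of the four corner contributions, smoothed by fade
    nx0 = dot_grad(0, 0) * (1 - u) + dot_grad(0, 1) * u
    nx1 = dot_grad(1, 0) * (1 - u) + dot_grad(1, 1) * u
    return nx0 * (1 - v) + nx1 * v

# Illustrative additive perturbation of a normalized grayscale image.
def perturb(img, eps=0.1, res=(4, 4), seed=0):
    noise = perlin_2d(img.shape, res, seed=seed)
    return np.clip(img + eps * noise, 0.0, 1.0)
```

Because Perlin noise is spatially correlated, the perturbation resembles natural texture or lighting variation, which is one reason a learned denoiser rather than a fixed low-pass filter is used to remove it.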
Key Contributions
- Single-layer convolutional autoencoder used as a plug-in denoising preprocessor for adversarially perturbed images, requiring no model retraining
- Quantitative evaluation showing Perlin-noise adversarial attacks reduce YOLOv5 bbox mAP by 43.3%, with the autoencoder recovering bbox mAP@50 by 10.8%
- Demonstration of autoencoder-based denoising as a model-agnostic defense for real-time object detection in safety-critical applications
🛡️ Threat Analysis
Paper proposes a defense (autoencoder-based input purification/denoising) against adversarial perturbations that degrade inference-time object detection performance — a canonical ML01 input manipulation defense.