Projection-based Adversarial Attack using Physics-in-the-Loop Optimization for Monocular Depth Estimation
Takeru Kusakabe, Yudai Hirose, Mashiho Mukaida, Satoshi Ono
Published on arXiv
arXiv:2512.24792
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
The PITL-based projection attack successfully caused significant depth misestimations in DNN-based MDE models under real-world conditions, making parts of objects disappear from estimated depth maps.
PITL-CMA-ES projection attack
Novel technique introduced
Deep neural networks (DNNs) remain vulnerable to adversarial attacks that cause misclassification when specific perturbations are added to input images. This vulnerability also threatens the reliability of DNN-based monocular depth estimation (MDE) models, making robustness enhancement a critical need in practical applications. To validate the vulnerability of DNN-based MDE models, this study proposes a projection-based adversarial attack method that projects perturbation light onto a target object. The proposed method employs physics-in-the-loop (PITL) optimization -- evaluating candidate solutions in actual environments to account for device specifications and disturbances -- and utilizes a separable covariance matrix adaptation evolution strategy (sep-CMA-ES). Experiments confirmed that the proposed method successfully created adversarial examples that led to depth misestimations, causing parts of objects to disappear from the target scene.
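The PITL loop described above can be sketched as a black-box evolution strategy whose fitness function is evaluated in the physical world. The following is a minimal, simplified sketch in the spirit of sep-CMA-ES (diagonal covariance only, no evolution paths or step-size control); the function names and update constants are illustrative assumptions, not the paper's implementation. In the actual PITL setting, `evaluate` would project the candidate pattern, capture the scene, and score the resulting depth map; here it is any black-box objective.

```python
import numpy as np

def sep_es_sketch(evaluate, dim, sigma=0.3, generations=120, seed=0):
    """Simplified separable-ES loop (sketch, not full sep-CMA-ES).

    evaluate(x) -> float scores a candidate perturbation vector
    (lower is better); in PITL optimization this call runs in the
    real environment, so device characteristics and disturbances
    are accounted for without a reflectance/lighting simulator.
    """
    rng = np.random.default_rng(seed)
    lam = 4 + int(3 * np.log(dim))              # population size
    mu = lam // 2                               # number of parents
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()                                # recombination weights
    mean = np.zeros(dim)
    d = np.ones(dim)                            # diagonal variances
    for _ in range(generations):
        z = rng.standard_normal((lam, dim))
        x = mean + sigma * np.sqrt(d) * z       # sample offspring
        order = np.argsort([evaluate(xi) for xi in x])
        z_sel, x_sel = z[order[:mu]], x[order[:mu]]
        mean = w @ x_sel                        # weighted recombination
        # crude diagonal covariance update (assumed learning rate 0.1)
        d = 0.9 * d + 0.1 * (w @ z_sel**2) * d
    return mean
```

Because only `evaluate` touches the model and the environment, the loop needs no gradients or model parameters, matching the black-box threat model.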
Key Contributions
- Physics-in-the-loop (PITL) optimization that evaluates adversarial candidate solutions in real-world environments, eliminating the need for complex reflectance/lighting simulators
- Non-invasive projection-based physical adversarial attack on MDE models using sep-CMA-ES for high-dimensional black-box optimization
- Demonstrated depth misestimations causing object regions to disappear from target scenes without requiring model parameter access
🛡️ Threat Analysis
Crafts adversarial perturbations (projected light patterns) that cause depth misestimation in DNN-based MDE models at inference time — a physical adversarial example attack driven by black-box optimization, requiring no access to model internals.
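One way to make this threat concrete is the objective the optimizer would minimize. The sketch below is a hypothetical attack loss (the paper's exact objective is not given here): it rewards large depth errors inside the target object's mask, so that the region no longer registers correctly in the estimated depth map.

```python
import numpy as np

def disappearance_loss(depth_adv, depth_benign, target_mask):
    """Hypothetical black-box objective for the projection attack.

    depth_adv    : depth map estimated under the projected perturbation
    depth_benign : depth map estimated without perturbation
    target_mask  : boolean mask of the attacked object's pixels

    Returns a value the optimizer minimizes; more negative means a
    larger depth error inside the mask (object 'disappearing').
    """
    err = np.abs(depth_adv - depth_benign)
    return -float(err[target_mask].mean())
```

Only the MDE model's depth output is needed to compute this, which is why the attack fits a query-only, black-box setting.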