Benchmark · 2026

Comparative Analysis of Patch Attack on VLM-Based Autonomous Driving Architectures

David Fernandez, Pedram MohajerAnsari, Amir Salarpour, Long Cheng, Abolfazl Razi, Mert D. Pesé



Published on arXiv: 2603.08897

Input Manipulation Attack

OWASP ML Top 10 — ML01

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

All three evaluated VLM architectures exhibit severe vulnerability to physical adversarial patches: attacks cause sustained multi-frame driving failures and critical object-detection degradation, and the failure modes reveal distinct per-architecture weakness patterns.

Novel technique introduced: NES-based adversarial patches with semantic homogenization


Vision-language models are emerging for autonomous driving, yet their robustness to physical adversarial attacks remains unexplored. This paper presents a systematic framework for comparative adversarial evaluation across three VLM architectures: Dolphins, OmniDrive (Omni-L), and LeapVAD. Using black-box optimization with semantic homogenization for fair comparison, we evaluate physically realizable patch attacks in CARLA simulation. Results reveal severe vulnerabilities across all architectures, sustained multi-frame failures, and critical object detection degradation. Our analysis exposes distinct architectural vulnerability patterns, demonstrating that current VLM designs inadequately address adversarial threats in safety-critical autonomous driving applications.


Key Contributions

  • Systematic cross-architecture adversarial evaluation framework for VLM-based autonomous driving systems with rigorous model selection criteria
  • Semantic homogenization layer that projects heterogeneous VLM outputs into a unified embedding space for architecture-agnostic attack comparison
  • Empirical characterization of distinct architectural vulnerability patterns across Dolphins, OmniDrive (Omni-L), and LeapVAD under physically realizable patch attacks in CARLA simulation
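The semantic homogenization idea above can be sketched in miniature. The paper's actual projection layer is not specified here; this toy uses a hashed bag-of-words embedding as a stand-in for a real sentence encoder, and the architecture outputs shown are hypothetical illustrations. The point is that heterogeneous free-form driving responses are mapped into one shared vector space, so attack-induced semantic drift can be compared architecture-agnostically.

```python
import zlib

import numpy as np

DIM = 64  # dimensionality of the shared embedding space

def embed(text, dim=DIM):
    """Hashed bag-of-words embedding (toy stand-in for a sentence encoder)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        # zlib.crc32 is deterministic across runs, unlike Python's str hash.
        vec[zlib.crc32(token.encode()) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def semantic_drift(clean_output, attacked_output):
    """Cosine distance between clean and under-attack outputs (0 = unchanged)."""
    return 1.0 - float(embed(clean_output) @ embed(attacked_output))

# Hypothetical heterogeneous outputs from the three architectures:
clean = {
    "Dolphins":  "slow down, pedestrian crossing ahead",
    "OmniDrive": "decelerate: pedestrian detected in crosswalk",
    "LeapVAD":   "brake, person ahead on road",
}
attacked = {
    "Dolphins":  "maintain speed, road is clear",
    "OmniDrive": "continue at current speed, no obstacles",
    "LeapVAD":   "brake, person ahead on road",  # unaffected in this frame
}

for arch in clean:
    print(arch, round(semantic_drift(clean[arch], attacked[arch]), 3))
```

Because every model's output passes through the same `embed` before comparison, a large drift score means the same thing regardless of each VLM's native output format.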

🛡️ Threat Analysis

Input Manipulation Attack

Physically realizable adversarial patches, crafted via black-box NES optimization, cause misclassification and object-detection degradation at inference time in VLM vision encoders.
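The black-box NES step can be illustrated with a minimal sketch. This is not the paper's implementation: `model_loss` is a hypothetical stand-in for the attacker's objective (in the real attack, something like the semantic drift of the VLM's answer under the patched frame), replaced here by a toy quadratic so the example is self-contained. NES estimates the gradient from loss queries alone, which is what makes the attack black-box.

```python
import numpy as np

def model_loss(patch):
    # Toy stand-in for the attacker's black-box objective:
    # push every patch pixel toward the value 0.8.
    return float(np.mean((patch - 0.8) ** 2))

def nes_gradient(patch, loss_fn, rng, sigma=0.1, n_samples=20):
    """Antithetic NES estimate of d(loss)/d(patch) using only loss queries."""
    grad = np.zeros_like(patch)
    for _ in range(n_samples):
        noise = rng.standard_normal(patch.shape)
        # Antithetic pair (+noise / -noise) reduces estimator variance.
        grad += noise * (loss_fn(patch + sigma * noise)
                         - loss_fn(patch - sigma * noise))
    return grad / (2.0 * sigma * n_samples)

def optimize_patch(shape=(8, 8, 3), steps=200, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    patch = np.full(shape, 0.5)           # start from mid-gray
    for _ in range(steps):
        patch -= lr * nes_gradient(patch, model_loss, rng)
        patch = np.clip(patch, 0.0, 1.0)  # keep pixel values physically printable
    return patch

patch = optimize_patch()
print(round(model_loss(patch), 4))
```

The clip to [0, 1] reflects the physical-realizability constraint: a printed patch can only contain valid pixel intensities, unlike an unconstrained digital perturbation.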


Details

Domains
vision, multimodal, nlp
Model Types
vlm, llm, transformer
Threat Tags
black_box, inference_time, physical, targeted
Datasets
CARLA simulation
Applications
autonomous driving, vlm-based end-to-end driving, pedestrian detection, highway steering control