attack 2026

Adversarial Patch Generation for Visual-Infrared Dense Prediction Tasks via Joint Position-Color Optimization

He Li 1, Wenyue He 1, Weihang Kong 1, Xingchen Zhang 2

0 citations

Published on arXiv

2603.00266

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

AP-PCO achieves consistently strong black-box attack performance across multiple VI dense prediction architectures using a single patch that perturbs both visible and infrared modalities.

AP-PCO

Novel technique introduced


Multimodal adversarial attacks for dense prediction remain largely underexplored. In particular, visual-infrared (VI) perception systems introduce unique challenges due to heterogeneous spectral characteristics and modality-specific intensity distributions. Existing adversarial patch methods are primarily designed for single-modal inputs and fail to account for cross-spectral inconsistencies, leading to reduced attack effectiveness and poor stealthiness when applied to VI dense prediction models. To address these challenges, we propose a joint position-color optimization framework (AP-PCO) for generating adversarial patches in visual-infrared settings. The proposed method optimizes patch placement and color composition simultaneously using a fitness function derived from model outputs, enabling a single patch to perturb both visible and infrared modalities. To further bridge spectral discrepancies, we introduce a cross-modal color adaptation strategy that constrains patch appearance according to infrared grayscale characteristics while maintaining strong perturbations in the visible domain, thereby reducing cross-spectral saliency. The optimization procedure operates without requiring internal model information, supporting flexible black-box attacks. Extensive experiments on visual-infrared dense prediction tasks demonstrate that the proposed AP-PCO achieves consistently strong attack performance across multiple architectures, providing a practical benchmark for robustness evaluation in VI perception systems.
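The joint position-color search described above can be illustrated with a toy, query-only loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the optimizer, so a simple (1+1)-style evolutionary search is swapped in, and `toy_model`, `apply_patch`, `fitness`, `mutate`, and `search` are all hypothetical names. For brevity each modality is a single-channel grid with values in [0, 1], and the patch "color" is one intensity per modality at a shared position.

```python
import random

H, W, PATCH = 16, 16, 4  # toy image and patch sizes (assumed, not from the paper)


def toy_model(visible, infrared):
    """Hypothetical black-box model: returns a dense per-pixel output map."""
    return [[0.7 * visible[y][x] + 0.3 * infrared[y][x] for x in range(W)]
            for y in range(H)]


def apply_patch(image, top, left, value):
    """Return a copy of `image` with a PATCH x PATCH square set to `value`."""
    out = [row[:] for row in image]
    for y in range(top, top + PATCH):
        for x in range(left, left + PATCH):
            out[y][x] = value
    return out


def fitness(candidate, visible, infrared, clean_out):
    """Score a candidate by how far it pushes the output from the clean output."""
    top, left, vis_val, ir_val = candidate
    out = toy_model(apply_patch(visible, top, left, vis_val),
                    apply_patch(infrared, top, left, ir_val))
    return sum(abs(out[y][x] - clean_out[y][x])
               for y in range(H) for x in range(W))


def clamp(v, lo, hi):
    return max(lo, min(hi, v))


def mutate(candidate, rng):
    """Jointly perturb the shared position and both patch intensities."""
    top, left, vis_val, ir_val = candidate
    return (clamp(top + rng.randint(-2, 2), 0, H - PATCH),
            clamp(left + rng.randint(-2, 2), 0, W - PATCH),
            clamp(vis_val + rng.uniform(-0.2, 0.2), 0.0, 1.0),
            clamp(ir_val + rng.uniform(-0.2, 0.2), 0.0, 1.0))


def search(visible, infrared, queries=200, seed=0):
    """(1+1)-style search that uses only model outputs, never gradients."""
    rng = random.Random(seed)
    clean_out = toy_model(visible, infrared)
    best = (rng.randint(0, H - PATCH), rng.randint(0, W - PATCH),
            rng.random(), rng.random())
    best_fit = fitness(best, visible, infrared, clean_out)
    for _ in range(queries):
        cand = mutate(best, rng)
        cand_fit = fitness(cand, visible, infrared, clean_out)
        if cand_fit >= best_fit:  # keep the candidate only if it is no worse
            best, best_fit = cand, cand_fit
    return best, best_fit


# Toy inputs: a diagonal gradient for the visible image, flat infrared.
visible = [[(x + y) / (H + W) for x in range(W)] for y in range(H)]
infrared = [[0.5] * W for _ in range(H)]
best, best_fit = search(visible, infrared)
```

Because position and intensities are mutated in the same step and scored by one fitness value, the search never needs gradients or model internals, which is the property that makes a black-box setting possible. A real attack would replace `toy_model` with queries to the target VI model and use a task-specific fitness.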


Key Contributions

  • Joint position-color optimization framework (AP-PCO) for adversarial patch generation that simultaneously searches patch placement and color composition via a gradient-free fitness function derived from model outputs
  • Cross-modal color adaptation strategy that constrains patch appearance to infrared grayscale characteristics while maintaining visible-domain perturbation strength, reducing cross-spectral saliency
  • First systematic adversarial patch attack targeting visual-infrared dense prediction models, demonstrating transferability across multiple architectures under black-box conditions

🛡️ Threat Analysis

Input Manipulation Attack

Proposes AP-PCO, a gradient-free adversarial patch generation method that crafts inputs to cause misclassification/incorrect outputs at inference time across both visible and infrared modalities in dense prediction models.


Details

Domains
vision, multimodal
Model Types
multimodal, cnn, transformer
Threat Tags
black_box, inference_time, digital
Applications
dense prediction, crowd counting, semantic segmentation, visual-infrared fusion, multimodal perception systems