attack 2025

Adversarial Robustness in Zero-Shot Learning: An Empirical Study on Class and Concept-Level Vulnerabilities

0 citations · 72 references · arXiv

Published on arXiv

2512.18651

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

CBEA completely eliminates GZSL accuracy across all calibration points, exposing the spurious nature of prior class-level attacks, while concept-level attacks further reveal that ZSL models are vulnerable to semantic concept manipulation.

CBEA (Class-Bias Enhanced Attack) / CPconA / NCPconA

Novel technique introduced

Zero-shot Learning (ZSL) aims to enable image classifiers to recognize images from unseen classes that were not included during training. Unlike traditional supervised classification, ZSL typically relies on learning a mapping from visual features to predefined, human-understandable class concepts. While ZSL models promise to improve generalization and interpretability, their robustness under systematic input perturbations remains unclear. In this study, we present an empirical analysis of the robustness of existing ZSL methods at both the class level and the concept level. Specifically, we successfully disrupt their class predictions with the well-known non-targeted class attack (clsA). However, in the Generalized Zero-shot Learning (GZSL) setting, we observe that clsA succeeds only at the original best-calibrated point. After the attack, the optimal calibration point shifts, and ZSL models maintain relatively strong performance at other calibration points, indicating that clsA achieves only a spurious attack success in GZSL. To address this, we propose the Class-Bias Enhanced Attack (CBEA), which completely eliminates GZSL accuracy across all calibration points by enhancing the gap between seen and unseen class probabilities. Next, for concept-level attacks, we introduce two novel attack modes: Class-Preserving Concept Attack (CPconA) and Non-Class-Preserving Concept Attack (NCPconA). Our extensive experiments evaluate three typical ZSL models across various architectures from the past three years and reveal that ZSL models are vulnerable not only to the traditional class attack but also to concept-based attacks. These attacks allow malicious actors to easily manipulate class predictions by erasing or introducing concepts. Our findings highlight a significant performance gap between existing approaches, emphasizing the need for improved adversarial robustness in current ZSL models.
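The role of calibration points in the abstract can be made concrete with a small sketch. It assumes that "calibration" refers to calibrated stacking, a common GZSL practice in which a constant gamma is subtracted from seen-class scores before the argmax, and that performance is summarized by the harmonic mean of seen and unseen accuracy. The scores, labels, and function names below are made up for illustration and are not taken from the paper.

```python
# Toy calibrated-stacking sweep (illustrative data, hypothetical names).

def predict(scores, seen_mask, gamma):
    """Argmax after subtracting gamma from every seen-class score."""
    adjusted = [s - gamma if seen else s for s, seen in zip(scores, seen_mask)]
    return max(range(len(adjusted)), key=adjusted.__getitem__)

def harmonic_mean(seen_acc, unseen_acc):
    total = seen_acc + unseen_acc
    return 0.0 if total == 0 else 2 * seen_acc * unseen_acc / total

def gzsl_curve(samples, seen_mask, gammas):
    """Return {gamma: harmonic mean of seen and unseen accuracy}."""
    curve = {}
    for gamma in gammas:
        hits = {True: 0, False: 0}
        counts = {True: 0, False: 0}
        for scores, label in samples:
            is_seen = seen_mask[label]
            counts[is_seen] += 1
            hits[is_seen] += predict(scores, seen_mask, gamma) == label
        seen_acc = hits[True] / counts[True] if counts[True] else 0.0
        unseen_acc = hits[False] / counts[False] if counts[False] else 0.0
        curve[gamma] = harmonic_mean(seen_acc, unseen_acc)
    return curve

# Classes 0 and 1 are seen, classes 2 and 3 are unseen.
seen_mask = [True, True, False, False]
samples = [
    ([3.0, 1.0, 0.5, 0.2], 0),
    ([1.0, 3.0, 0.4, 0.1], 1),
    ([2.0, 0.5, 1.5, 0.3], 2),  # unseen sample biased toward seen class 0
    ([0.5, 2.0, 0.2, 1.4], 3),  # unseen sample biased toward seen class 1
]
curve = gzsl_curve(samples, seen_mask, [0.0, 1.0, 5.0])
```

In this toy data the harmonic mean is zero at gamma 0.0 and 5.0 and peaks at gamma 1.0. An attack judged only at one fixed gamma can therefore look more damaging than it is if some other gamma still yields a high harmonic mean on the perturbed inputs, which is why sweeping the whole curve matters.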

Key Contributions

Identifies spurious attack success in GZSL settings: standard non-targeted class attacks shift the optimal calibration point rather than truly defeating the model across all calibration points.
Proposes Class-Bias Enhanced Attack (CBEA) that completely eliminates GZSL accuracy across all calibration points by amplifying the seen/unseen class probability gap.
Introduces two novel concept-level attack modes — Class-Preserving Concept Attack (CPconA) and Non-Class-Preserving Concept Attack (NCPconA) — that manipulate ZSL predictions by erasing or injecting semantic concepts.
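Why editing concepts can flip a ZSL prediction, or leave it unchanged, can be sketched with a toy attribute-based classifier. This is not the paper's CPconA or NCPconA: those perturb input images, while the sketch edits concept scores directly. The concept names, class signatures, and the cosine-similarity rule are all assumptions chosen for illustration.

```python
import math

# Hypothetical concepts and binary class signatures (illustration only).
CONCEPTS = ["stripes", "hooves", "long_neck"]
SIGNATURES = {
    "zebra": [1, 1, 0],
    "horse": [0, 1, 0],
    "giraffe": [0, 1, 1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)

def classify(concept_scores):
    """Pick the class whose signature is most similar to the concept scores."""
    return max(SIGNATURES, key=lambda name: cosine(concept_scores, SIGNATURES[name]))

def set_concept(concept_scores, name, value):
    """Return a copy of the scores with one concept overwritten."""
    edited = list(concept_scores)
    edited[CONCEPTS.index(name)] = value
    return edited

clean = [0.9, 0.8, 0.1]                            # classified as zebra
erased = set_concept(clean, "stripes", 0.0)        # erase a concept
injected = set_concept(erased, "long_neck", 0.9)   # then inject another
nudged = set_concept(clean, "long_neck", 0.6)      # concept changed, class kept
```

Here erasing "stripes" moves the prediction from zebra to horse, and injecting "long_neck" afterwards moves it to giraffe, while the milder edit in `nudged` changes a concept score but keeps the zebra label. That loosely mirrors the class-preserving versus non-class-preserving distinction the attack names suggest.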

🛡️ Threat Analysis

Input Manipulation Attack

Paper proposes adversarial input perturbation attacks (CBEA, CPconA, NCPconA) that cause misclassification in ZSL/GZSL image classifiers at inference time, both at class-level and concept-level — core adversarial example attack research.
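For readers unfamiliar with this attack class, the following is a minimal sketch of an untargeted, white-box, inference-time perturbation using projected gradient descent (PGD) under an L-infinity budget. PGD is a standard stand-in here. The summary does not say which optimizer the paper uses, and the two-class linear model, its weights, and the hyperparameters are made up for illustration.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def logits(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def predict(W, x):
    z = logits(W, x)
    return max(range(len(z)), key=z.__getitem__)

def input_gradient(W, x, y):
    """Gradient of cross-entropy loss w.r.t. x for a linear softmax model."""
    p = softmax(logits(W, x))
    err = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    return [sum(err[k] * W[k][d] for k in range(len(W))) for d in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_untargeted(W, x, y, eps=0.1, alpha=0.02, steps=20):
    """Ascend the loss on the true label y, staying within eps of x."""
    x_adv = list(x)
    for _ in range(steps):
        grad = input_gradient(W, x_adv, y)
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the L-infinity ball around the clean input.
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
        # Keep values in the valid pixel range [0, 1].
        x_adv = [min(max(xa, 0.0), 1.0) for xa in x_adv]
    return x_adv

# Toy two-class linear model and a clean input classified as class 0.
W = [[1.0, -1.0], [-1.0, 1.0]]
x_clean = [0.55, 0.45]
x_adv = pgd_untargeted(W, x_clean, y=0)
```

On this toy model the perturbed input stays within 0.1 of the clean one in every coordinate, yet the predicted class changes from 0 to 1. Judging from the descriptions above, the class-level and concept-level attacks would differ mainly in the loss being ascended, not in this overall loop.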

Details

Domains

vision

Model Types

cnn, transformer, gnn

Threat Tags

white_box, inference_time, targeted, untargeted, digital

Datasets

AWA2, CUB, SUN

Applications

zero-shot image classification, generalized zero-shot learning
