Contrastive Spectral Rectification: Test-Time Defense towards Zero-shot Adversarial Robustness of CLIP
Sen Nie 1,2, Jie Zhang 1,2, Zhuo Wang 3, Shiguang Shan 1,2, Xilin Chen 1,2
Published on arXiv
2601.19210
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
CSR outperforms state-of-the-art test-time defenses by an average of 18.1% against AutoAttack across 16 classification benchmarks on CLIP.
Contrastive Spectral Rectification (CSR)
Novel technique introduced
Vision-language models (VLMs) such as CLIP have demonstrated remarkable zero-shot generalization, yet they remain highly vulnerable to adversarial examples (AEs). While test-time defenses are promising, existing methods fail to provide sufficient robustness against strong attacks and are often hampered by high inference latency and task-specific applicability. To address these limitations, we begin by investigating the intrinsic properties of AEs, revealing that AEs exhibit severe feature inconsistency under progressive frequency attenuation. We further attribute this to the model's inherent spectral bias. Leveraging this insight, we propose an efficient test-time defense named Contrastive Spectral Rectification (CSR). CSR optimizes a rectification perturbation, applied input-adaptively, that realigns the input with the natural manifold under a spectral-guided contrastive objective. Extensive experiments across 16 classification benchmarks demonstrate that CSR outperforms the SOTA by an average of 18.1% against the strong AutoAttack with modest inference overhead. Furthermore, CSR exhibits broad applicability across diverse visual tasks. Code is available at https://github.com/Summu77/CSR.
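The paper's diagnostic — that adversarial examples show feature inconsistency under progressive frequency attenuation — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: `low_pass` is a standard FFT low-pass filter, and `encoder` stands in for CLIP's image encoder (here a hypothetical callable; a plain flatten is used in the usage example so the sketch is self-contained).

```python
import numpy as np

def low_pass(img, radius):
    """Keep only spatial frequencies within `radius` of the DC component."""
    f = np.fft.fftshift(np.fft.fft2(img, axes=(0, 1)), axes=(0, 1))
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    mask = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2) <= radius ** 2
    f = f * mask[..., None] if img.ndim == 3 else f * mask
    return np.real(np.fft.ifft2(np.fft.ifftshift(f, axes=(0, 1)), axes=(0, 1)))

def cosine(a, b):
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def consistency_curve(img, encoder, radii=(32, 24, 16, 8)):
    """Cosine similarity between the features of the original image and its
    progressively low-pass-filtered versions. Per the paper's finding, this
    curve should degrade much faster for adversarial inputs than clean ones."""
    base = encoder(img)
    return [cosine(base, encoder(low_pass(img, r))) for r in radii]

# Usage with a stand-in encoder (flattening); with CLIP one would pass
# the actual image-encoder forward function instead.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
encoder = lambda x: x.ravel()
curve = consistency_curve(img, encoder, radii=(23, 8, 2))
```

A large enough radius passes the full spectrum, so the first similarity is near 1; tighter cutoffs attenuate more high-frequency content and the similarity drops.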
Key Contributions
- Empirical insight that adversarial examples exhibit severe feature inconsistency under progressive frequency attenuation, attributed to CLIP's inherent spectral bias
- CSR: an input-adaptive test-time defense that optimizes a rectification perturbation to realign adversarial inputs to the natural manifold using a spectral-guided contrastive objective
- Demonstrates an 18.1% average improvement over the SOTA against AutoAttack across 16 classification benchmarks, with modest inference overhead and broad task applicability
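The second contribution — test-time optimization of a rectification perturbation under a frequency-consistency objective — can be sketched as below. Everything here is an assumption-laden stand-in: the paper's actual objective is a spectral-guided contrastive loss optimized with gradients through CLIP, whereas this sketch uses a simple negative-cosine consistency loss, a flatten encoder, and SPSA (random-probe finite differences) so it runs without any model or autograd.

```python
import numpy as np

def low_pass(img, radius):
    """FFT low-pass: keep frequencies within `radius` of the DC component."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2) <= radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

def cosine(a, b):
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def spectral_consistency_loss(img, encoder, radii=(12, 6)):
    """Negative mean feature similarity between the image and its low-pass
    variants: lower loss means the input is more frequency-consistent."""
    base = encoder(img)
    return -float(np.mean([cosine(base, encoder(low_pass(img, r)))
                           for r in radii]))

def rectify(img, encoder, steps=20, lr=0.05, eps=1e-3, rng=None):
    """Optimize an additive perturbation `delta` that makes the input
    frequency-consistent (SPSA stand-in for the paper's gradient-based
    spectral-guided contrastive objective)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    delta = np.zeros_like(img)
    for _ in range(steps):
        probe = rng.choice([-1.0, 1.0], size=img.shape)
        # Two-sided finite-difference estimate of the directional gradient.
        g = (spectral_consistency_loss(img + delta + eps * probe, encoder)
             - spectral_consistency_loss(img + delta - eps * probe, encoder)
             ) / (2 * eps)
        delta -= lr * g * probe
    return np.clip(img + delta, 0.0, 1.0)

# Usage with a stand-in encoder; CSR would use CLIP's image encoder and
# backpropagation instead of SPSA.
rng = np.random.default_rng(0)
img = rng.random((16, 16))
out = rectify(img, lambda x: x.ravel(), steps=5, rng=rng)
```

The design point this illustrates is that the defense is purely test-time: nothing is retrained, only a small per-input perturbation is optimized before the rectified image is passed to the frozen model.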
🛡️ Threat Analysis
Directly defends against adversarial examples (input manipulation attacks) targeting CLIP at inference time; evaluated against strong attacks including AutoAttack; proposes a novel input purification strategy via spectral rectification to restore natural manifold alignment.