
Automatic Attack Discovery for Few-Shot Class-Incremental Learning via Large Language Models

Haidong Kang 1, Wei Wu 2, Hanling Wang 3

0 citations · 24 references · arXiv


Published on arXiv · 2512.03882

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

ACraft degrades state-of-the-art FSCIL model performance more severely than human expert-designed attacks (PGD, FGSM) while incurring lower attack costs

ACraft

Novel technique introduced


Few-shot class-incremental learning (FSCIL) is a realistic and challenging continual-learning paradigm that incrementally learns unseen classes from only a few training examples while overcoming catastrophic forgetting on base classes. Previous efforts have primarily centered on designing more effective FSCIL approaches; by contrast, far less attention has been devoted to the security issues of FSCIL. This paper provides a holistic study of the impact of attacks on FSCIL. We first derive insights by systematically exploring how human expert-designed attack methods (i.e., PGD, FGSM) affect FSCIL. We find that these methods either fail to attack base classes or incur huge labor costs because they rely on extensive expert knowledge, which highlights the need for a specialized attack method for FSCIL. Grounded in these insights, we propose ACraft, a simple yet effective method that leverages Large Language Models (LLMs) to automatically steer and discover optimal attack methods for FSCIL without human experts. Moreover, to improve the reasoning between LLMs and FSCIL, we introduce Proximal Policy Optimization (PPO) based reinforcement learning, establishing positive feedback so that LLMs generate better attack methods in the next generation. Experiments on mainstream benchmarks show that ACraft significantly degrades the performance of state-of-the-art FSCIL methods, substantially outperforms human expert-designed attack methods, and maintains the lowest attack costs.
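The expert-designed baselines the paper contrasts against (FGSM, PGD) can be sketched in a few lines. The sketch below is illustrative, not the paper's code: it assumes pixel inputs scaled to [0, 1] and a caller-supplied gradient function for the model's loss.

```python
import numpy as np

def fgsm_attack(x, grad, epsilon=0.03):
    """Fast Gradient Sign Method: one step in the direction of the sign of
    the loss gradient, bounded by epsilon in the L-infinity norm."""
    return np.clip(x + epsilon * np.sign(grad), 0.0, 1.0)

def pgd_attack(x, grad_fn, epsilon=0.03, alpha=0.01, steps=10):
    """Projected Gradient Descent: iterated signed-gradient steps, each
    projected back into the epsilon-ball around the original input."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # project to epsilon-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                  # keep valid pixel range
    return x_adv
```

Both attacks assume white-box gradient access and fixed, hand-chosen hyperparameters (epsilon, alpha, steps), which is exactly the expert-knowledge dependence the paper argues makes them costly to adapt to FSCIL.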


Key Contributions

  • Systematic empirical analysis revealing that standard adversarial attacks (PGD, FGSM) fail to adequately attack FSCIL systems, either because they are ineffective on base classes or because they carry prohibitive expert labor costs
  • ACraft framework that leverages LLMs to automatically generate optimal attack methods for FSCIL without requiring human expert knowledge
  • PPO-based reinforcement learning loop that establishes positive feedback between LLM-generated attack candidates and FSCIL model performance, iteratively improving attack quality across generations
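The LLM-in-the-loop search described in the contributions can be sketched as a generation loop. This is a minimal sketch, not the authors' implementation: `propose` and `evaluate` are hypothetical stand-ins for the LLM call and the FSCIL evaluation, and a simple scalar reward (accuracy drop minus attack cost) replaces the paper's full PPO update.

```python
def discovery_loop(propose, evaluate, generations=5):
    """Illustrative attack-discovery loop: each generation the proposer
    emits an attack candidate, the candidate is scored against the FSCIL
    model, and the best-so-far candidate plus its reward are fed back to
    steer the next proposal round (the paper's "positive feedback")."""
    best, best_reward = None, float("-inf")
    feedback = None
    for _ in range(generations):
        candidate = propose(feedback)          # LLM call in the real system
        acc_drop, cost = evaluate(candidate)   # run candidate against FSCIL model
        reward = acc_drop - cost               # scalar reward; the paper uses PPO
        if reward > best_reward:
            best, best_reward = candidate, reward
        feedback = (best, best_reward)         # signal for the next generation
    return best, best_reward
```

The key design point the sketch preserves is that the search is closed-loop: the proposer conditions on how earlier candidates actually performed, so attack quality can improve across generations without human intervention.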

🛡️ Threat Analysis

Input Manipulation Attack

ACraft automatically discovers and crafts adversarial attack methods (building on gradient-based attacks such as PGD and FGSM) that cause misclassification in FSCIL models at inference time, making it a direct input manipulation attack.


Details

Domains
vision, nlp
Model Types
cnn, transformer, llm, rl
Threat Tags
white_box, inference_time, targeted
Datasets
miniImageNet, CIFAR-100, CUB-200
Applications
few-shot class-incremental learning, image classification