Overcoming Black-box Attack Inefficiency with Hybrid and Dynamic Select Algorithms

Adversarial text attack research plays a crucial role in evaluating the robustness of NLP models. However, the increasing complexity of transformer-based architectures has dramatically raised the computational cost of attack testing, especially for researchers with limited resources (e.g., GPUs). Popular black-box attack methods often require a large number of queries, which makes them inefficient and impractical for such researchers. To address these challenges, we propose two new attack selection strategies, Hybrid Select and Dynamic Select, which combine the strengths of previous selection algorithms. Hybrid Select merges generalized BinarySelect techniques with GreedySelect by introducing a size threshold that decides which selection algorithm to use. Dynamic Select offers an alternative way of combining generalized BinarySelect and GreedySelect: it learns which text lengths each selection method should be applied to. Both strategies greatly reduce the number of queries needed while maintaining attack effectiveness (loss of effectiveness being a limitation of BinarySelect). Across 4 datasets and 6 target models, our best method (sentence-level Hybrid Select) reduces the number of required queries per attack by up to 25.82% on average against both encoder models and LLMs, without losing attack effectiveness.

Key Contributions

Hybrid Select: combines BinarySelect and GreedySelect via a size threshold, choosing the appropriate algorithm based on input length to reduce query count
Dynamic Select: learns which text lengths each selection method excels at, further optimizing query efficiency adaptively
Achieves up to 25.82% reduction in queries per attack averaged across 4 datasets and 6 models (encoder models + LLMs) without degrading attack success rate
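
The threshold idea behind Hybrid Select can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that binary splitting narrows the search range until it is no larger than the threshold, and that a greedy pass then scores each remaining token. The names `hybrid_select`, `greedy_select`, `threshold`, and the `score` callable (which stands in for one query to the target model) are hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for one black-box query: returns the target
# model's probability for the original label given a token list.
Score = Callable[[List[str]], float]


def greedy_select(tokens: List[str], lo: int, hi: int,
                  score: Score, base: float) -> Tuple[int, int]:
    """Delete each token in [lo, hi) in turn; return (best index, queries used)."""
    best_idx, best_drop = lo, float("-inf")
    for i in range(lo, hi):
        drop = base - score(tokens[:i] + tokens[i + 1:])
        if drop > best_drop:
            best_idx, best_drop = i, drop
    return best_idx, hi - lo


def hybrid_select(tokens: List[str], score: Score,
                  threshold: int) -> Tuple[int, int]:
    """Return (index of the most influential token, total queries used).

    Assumption: binary splitting runs while the range is larger than
    `threshold`; the greedy pass then handles the remaining range.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    base = score(tokens)  # one query for the unmodified text
    queries = 1
    lo, hi = 0, len(tokens)
    while hi - lo > threshold and hi - lo > 1:
        mid = (lo + hi) // 2
        # Remove each half of the current range and measure the score drop.
        drop_left = base - score(tokens[:lo] + tokens[mid:])
        drop_right = base - score(tokens[:mid] + tokens[hi:])
        queries += 2
        if drop_left >= drop_right:
            hi = mid  # the left half mattered more
        else:
            lo = mid  # the right half mattered more
    idx, used = greedy_select(tokens, lo, hi, score, base)
    return idx, queries + used


# Toy classifier: "confident" only while the word "terrible" is present.
def toy_score(toks: List[str]) -> float:
    return 0.9 if "terrible" in toks else 0.1


toy_tokens = "the plot of this movie was terrible overall indeed".split()
```

Setting `threshold` to at least the text length reduces the sketch to a pure greedy pass, while a threshold of 1 reduces it to a pure binary search, so the threshold trades query count against how finely each candidate token is scored.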

🛡️ Threat Analysis

Input Manipulation Attack

The paper proposes adversarial text attack strategies (word/sentence substitution) that cause NLP classifiers to misclassify at inference time, which are classic evasion attacks on ML models. The contribution is attack-selection efficiency, but the attack goal remains misclassification via crafted adversarial inputs.

Details

Domains

nlp

Model Types

transformer, llm

Threat Tags

black_box, inference_time, untargeted

Datasets

IMDb, Yelp, AGNews

Applications

2025, 0 citations
