Hidden Tail: Adversarial Image Causing Stealthy Resource Consumption in Vision-Language Models

Rui Zhang 1, Zihan Wang 1, Tianli Yang 1, Hongwei Li 1, Wenbo Jiang 1, Qingchuan Zhao 2, Yang Liu 3, Guowen Xu 1

Published on arXiv (2508.18805)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Model Denial of Service

OWASP LLM Top 10 — LLM04

Key Finding

Hidden Tail increases VLM output length by up to 19.2×, reaching the maximum token limit, while preserving stealth: the user-visible output is indistinguishable from a clean-image response.

Hidden Tail

Novel technique introduced


Vision-Language Models (VLMs) are increasingly deployed in real-world applications, but their high inference cost makes them vulnerable to resource consumption attacks. Prior attacks attempt to extend VLM output sequences by optimizing adversarial images, thereby increasing inference costs. However, these extended outputs often introduce irrelevant abnormal content, compromising attack stealthiness. This trade-off between effectiveness and stealthiness poses a major limitation for existing attacks. To address this challenge, we propose Hidden Tail, a stealthy resource consumption attack that crafts prompt-agnostic adversarial images, inducing VLMs to generate maximum-length outputs by appending special tokens invisible to users. Our method employs a composite loss function that balances semantic preservation, repetitive special token induction, and suppression of the end-of-sequence (EOS) token, optimized via a dynamic weighting strategy. Extensive experiments show that Hidden Tail outperforms existing attacks, increasing output length by up to 19.2× and reaching the maximum token limit, while preserving attack stealthiness. These results highlight the urgent need to improve the robustness of VLMs against efficiency-oriented adversarial threats. Our code is available at https://github.com/zhangrui4041/Hidden_Tail.


Key Contributions

  • Hidden Tail attack: prompt-agnostic adversarial images that induce VLMs to generate maximum-length outputs via user-invisible special tokens appended as a hidden tail
  • Composite loss function balancing semantic consistency, hidden tail induction, and EOS suppression with dynamic weighting strategy
  • Demonstrates 19.2x output length increase reaching maximum token limit while maintaining user-visible output quality comparable to clean images
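The composite loss described above can be sketched as follows. This is a minimal reconstruction, not the authors' exact formulation: the three terms (semantic preservation, repeated special-token induction for the hidden tail, EOS suppression) and the linear dynamic-weighting schedule are assumptions about how such an objective is typically assembled.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def composite_loss(logits_per_step, special_id, eos_id, sem_dist,
                   w_sem=1.0, w_tail=1.0, w_eos=0.5):
    """Sketch of a Hidden-Tail-style composite objective (assumed form):
      - sem_dist: distance between clean and adversarial visible responses
                  (semantic preservation term, computed elsewhere)
      - tail term: cross-entropy pushing each decoding step toward the
                   user-invisible special token
      - eos term: probability mass on EOS, penalized to delay termination
    """
    tail_loss, eos_loss = 0.0, 0.0
    for logits in logits_per_step:
        probs = softmax(logits)
        tail_loss += -math.log(probs[special_id] + 1e-12)
        eos_loss += probs[eos_id]
    n = len(logits_per_step)
    return w_sem * sem_dist + w_tail * tail_loss / n + w_eos * eos_loss / n

def dynamic_sem_weight(step, total_steps, w_sem0=1.0):
    # Toy linear schedule (assumption): fade the semantic-preservation weight
    # as optimization shifts from matching the visible answer to extending
    # the hidden tail of special tokens.
    return w_sem0 * (1.0 - step / total_steps)
```

In this sketch, pushing logit mass onto the special token at each step lowers the loss, while any probability assigned to EOS raises it, which is the mechanism that drives generation toward the maximum token limit.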

🛡️ Threat Analysis

Input Manipulation Attack

Crafts adversarial images via gradient-based optimization (composite loss with EOS suppression, hidden tail induction) that manipulate VLM inference-time behavior — this is a visual adversarial input attack on a multimodal model.
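The gradient-based image optimization can be illustrated with a standard signed-gradient (PGD-style) pixel update under an L∞ budget. This is a generic sketch, not the paper's reported optimizer or hyperparameters; the step size and epsilon values here are illustrative assumptions.

```python
def pgd_step(pixels, grads, orig, step=1 / 255, eps=8 / 255):
    """One signed-gradient descent step on the composite loss, projected
    back into an L-infinity ball around the original image (standard PGD;
    the paper's exact optimizer settings are an assumption here).

    pixels, grads, orig: flat lists of floats in [0, 1].
    """
    out = []
    for p, g, o in zip(pixels, grads, orig):
        sign = 1.0 if g > 0 else -1.0 if g < 0 else 0.0
        p2 = p - step * sign                  # descend the composite loss
        p2 = max(o - eps, min(o + eps, p2))   # project into the eps-ball
        out.append(max(0.0, min(1.0, p2)))    # keep a valid pixel range
    return out
```

Iterating this update while evaluating the composite loss through the VLM's image encoder yields the adversarial image; the perturbation stays small enough that the visible response remains plausible.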


Details

Domains
vision, multimodal, nlp
Model Types
vlm, llm, transformer
Threat Tags
white_box, inference_time, targeted, digital
Datasets
diverse prompt-response datasets constructed per image
Applications
vision-language models, multimodal RAG pipelines, VLM-based web search tools