A Visual Semantic Adaptive Watermark grounded by Prefix-Tuning for Large Vision-Language Model
Qi Zheng 1,2, Shuliang Liu 1,2, Yu Huang 1,2, Sihang Jia 1,2, Jungang Li 1,2, Lyuhao Chen 3, Junhao Chen 1,2, Hanqian Li 1,2, Aiwei Liu 1,2, Yibo Yan 1,2, Xuming Hu 1,2
1 The Hong Kong University of Science and Technology (Guangzhou)
Published on arXiv
2601.07291
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
VISA-Mark achieves 96.88% AUC detection accuracy and 99.3% attack resilience while improving visual consistency by 7.8% (Chair-I) over vision-agnostic watermarking baselines.
VISA-Mark
Novel technique introduced
Watermarking has emerged as a pivotal solution for content traceability and intellectual property protection in Large Vision-Language Models (LVLMs). However, vision-agnostic watermarks introduce visually irrelevant tokens and disrupt visual grounding by enforcing indiscriminate pseudo-random biases, while some semantic-aware methods incur prohibitive inference latency due to rejection sampling. In this paper, we propose the VIsual Semantic Adaptive Watermark (VISA-Mark), a novel framework that embeds detectable signals while strictly preserving visual fidelity. Our approach employs a lightweight, efficiently trained prefix-tuner to extract dynamic Visual-Evidence Weights, which quantify the evidentiary support for candidate tokens based on the visual input. These weights guide an adaptive vocabulary partitioning and logits perturbation mechanism, concentrating watermark strength specifically on visually-supported tokens. By actively aligning the watermark with visual evidence, VISA-Mark effectively maintains visual fidelity. Empirical results confirm that VISA-Mark outperforms conventional methods with a 7.8% improvement in visual consistency (Chair-I) and superior semantic fidelity. The framework maintains highly competitive detection accuracy (96.88% AUC) and robust attack resilience (99.3%) without sacrificing inference efficiency, effectively establishing a new standard for reliability-preserving multimodal watermarking.
Key Contributions
- VISA-Mark framework that adaptively concentrates watermark strength on visually-supported tokens using dynamic Visual-Evidence Weights, preserving visual fidelity of LVLM outputs
- Lightweight prefix-tuner that extracts token-level visual evidence weights to guide adaptive vocabulary partitioning and logits perturbation without rejection sampling overhead
- Demonstrated a 7.8% improvement in visual consistency (Chair-I) over vision-agnostic watermarking baselines, alongside 96.88% AUC detection accuracy and 99.3% attack resilience
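The adaptive mechanism above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the hash-seeded green/red vocabulary partition follows the standard logit-bias watermarking scheme, and the `evidence_weights` array, `gamma`, and bias `delta` are assumed names and values used only to show how per-token visual-evidence weights could scale the watermark bias.

```python
import hashlib
import random

def green_list(prev_token_id: int, vocab_size: int, gamma: float = 0.5) -> set:
    # Pseudo-randomly partition the vocabulary, seeded by the previous token
    # (standard green/red-list watermarking; the paper's exact partition rule
    # may differ).
    seed = int(hashlib.sha256(str(prev_token_id).encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])

def perturb_logits(logits, prev_token_id, evidence_weights, delta=2.0):
    # Scale the watermark bias per green-list token by its visual-evidence
    # weight, concentrating the signal on visually supported tokens
    # (illustrative only; evidence_weights is a hypothetical per-token array
    # in [0, 1] produced by the prefix-tuner).
    greens = green_list(prev_token_id, len(logits))
    return [
        logit + delta * evidence_weights[i] if i in greens else logit
        for i, logit in enumerate(logits)
    ]
```

In this sketch, a token with low visual-evidence weight receives almost no bias, so the watermark never pushes generation toward visually unsupported tokens.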
🛡️ Threat Analysis
VISA-Mark watermarks the TEXT OUTPUTS of LVLMs to enable content traceability and provenance tracking — the watermark is embedded in generated outputs (via logits perturbation at inference time), not in model weights. This is output integrity / AI-generated content attribution, not model IP protection (ML05).
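Detection of logit-bias watermarks of this family is typically a one-proportion z-test on the count of green-list tokens in a candidate text; the summary above does not give VISA-Mark's detector details, so the following is only the standard test, shown for context.

```python
import math

def z_score(green_count: int, total: int, gamma: float = 0.5) -> float:
    # Under the null hypothesis (unwatermarked text), each token falls in the
    # green list independently with probability gamma; a large z indicates a
    # watermark. gamma is the assumed green-list fraction.
    return (green_count - gamma * total) / math.sqrt(total * gamma * (1 - gamma))
```

For example, 80 green tokens out of 100 at gamma = 0.5 yields z = 6.0, far beyond typical detection thresholds.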