
ChartAttack: Testing the Vulnerability of LLMs to Malicious Prompting in Chart Generation

Jesus-German Ortiz-Barajas 1,2, Jonathan Tonglet 3, Vivek Gupta 4, Iryna Gurevych 3,5

0 citations · 55 references · arXiv


Published on arXiv · 2601.12983

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

ChartAttack reduces MLLM chart-QA accuracy by an average of 19.6 percentage points in-domain and 14.9 cross-domain, and degrades human accuracy by 20.2 points.

ChartAttack

Novel technique introduced


Multimodal large language models (MLLMs) are increasingly used to automate chart generation from data tables, enabling efficient data analysis and reporting but also introducing new misuse risks. In this work, we introduce ChartAttack, a novel framework for evaluating how MLLMs can be misused to generate misleading charts at scale. ChartAttack injects misleaders into chart designs, aiming to induce incorrect interpretations of the underlying data. Furthermore, we create AttackViz, a chart question-answering (QA) dataset where each (chart specification, QA) pair is labeled with effective misleaders and their induced incorrect answers. Experiments in in-domain and cross-domain settings show that ChartAttack significantly degrades the QA performance of MLLM readers, reducing accuracy by an average of 19.6 points and 14.9 points, respectively. A human study further shows an average 20.2 point drop in accuracy for participants exposed to misleading charts generated by ChartAttack. Our findings highlight an urgent need for robustness and security considerations in the design, evaluation, and deployment of MLLM-based chart generation systems. We make our code and data publicly available.
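To make the idea of a "misleader" concrete, here is a minimal, hypothetical sketch of one classic example: truncating the y-axis so that small differences look dramatic. The function name, the Vega-Lite-style spec layout, and the 5% margin are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of one misleader in the spirit of ChartAttack:
# mutate a Vega-Lite-style chart spec so the y-axis no longer starts
# at zero, visually exaggerating small differences in the data.
# Spec layout and names are illustrative, not from the paper.

def inject_truncated_axis(spec, values):
    """Return a copy of `spec` with a truncated y-axis (a classic misleader)."""
    lo, hi = min(values), max(values)
    misled = dict(spec)
    misled["encoding"] = dict(spec.get("encoding", {}))
    # Anchor the axis just below the data minimum instead of at zero,
    # so the small spread between values fills the whole plot area.
    misled["encoding"]["y"] = {
        "field": "value",
        "type": "quantitative",
        "scale": {"domain": [lo - 0.05 * (hi - lo), hi]},
    }
    return misled

spec = {"mark": "bar", "encoding": {"x": {"field": "year", "type": "ordinal"}}}
values = [98, 99, 101, 102]
attacked = inject_truncated_axis(spec, values)
print(attacked["encoding"]["y"]["scale"]["domain"])  # lower bound is 97.8, not 0
```

A reader of the attacked chart sees bars whose heights differ by a factor of ~20 when the underlying values differ by only ~4%, which is exactly the kind of incorrect interpretation the AttackViz QA pairs are designed to probe.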


Key Contributions

  • ChartAttack: first automated framework for jailbreaking MLLMs to generate misleading charts via systematically injected visualization design violations (misleaders)
  • AttackViz: a multi-label chart QA dataset with structured annotations linking misleaders, chart specifications, and the incorrect answers each misleader induces
  • Empirical evaluation showing ChartAttack reduces MLLM QA accuracy by 19.6 pp (in-domain) and 14.9 pp (cross-domain), and human accuracy by 20.2 pp

🛡️ Threat Analysis


Details

  • Domains: multimodal, vision, nlp
  • Model Types: vlm, llm, multimodal
  • Threat Tags: black_box, inference_time
  • Datasets: AttackViz, ChartQA
  • Applications: chart generation, data visualization, MLLM-based reporting systems