Madison Van Doren

Papers in Database (1)

benchmark · AAAI 2026 AIGOV Workshop and E... · Sep 18, 2025 · Sep 2025

Red Teaming Multimodal Language Models: Evaluating Harm Across Prompt Modalities and Models

Madison Van Doren, Casey Ford · Appen

Human red-team benchmark of 4 MLLMs across 726 adversarial prompts finds Pixtral 12B most vulnerable at ~62% harm rate vs Claude's ~10%

Prompt Injection · nlp · multimodal