Casey Ford

h-index: 0 · 0 citations · 3 papers (total)

Papers in Database (1)

benchmark · arXiv · Feb 4, 2026 · 8w ago

Alignment Drift in Multimodal LLMs: A Two-Phase, Longitudinal Evaluation of Harm Across Eight Model Releases

Casey Ford, Madison Van Doren, Emily Dix · Appen

Longitudinal red-team benchmark reveals unstable alignment across MLLM generations, with GPT and Claude showing increased attack success rates over time

Prompt Injection · nlp · multimodal