Defense · 2025

EcoAlign: An Economically Rational Framework for Efficient LVLM Alignment

Ruoxi Cheng 1,2, Haoxuan Ma 2, Teng Ma 3, Hongyi Zhang 4

Published on arXiv (arXiv:2511.11301) · 2 citations

Prompt Injection

OWASP LLM Top 10: LLM01

Key Finding

EcoAlign matches or surpasses state-of-the-art safety and utility across 3 closed-source and 2 open-source LVLMs at lower computational cost than existing alignment methods

EcoAlign

Novel technique introduced


Large Vision-Language Models (LVLMs) exhibit powerful reasoning capabilities but remain vulnerable to sophisticated jailbreaks. Fundamentally, aligning LVLMs is not just a safety challenge but a problem of economic efficiency. Current alignment methods struggle with the trade-off between safety, utility, and operational cost. Critically, a focus solely on final outputs (process-blindness) wastes significant computational budget on unsafe deliberation. This flaw allows harmful reasoning to be disguised with benign justifications, circumventing simple additive safety scores. To address this, we propose EcoAlign, an inference-time framework that reframes alignment as an economically rational search by treating the LVLM as a boundedly rational agent. EcoAlign incrementally expands a thought graph and scores candidate actions with a forward-looking function (analogous to net present value) that dynamically weighs expected safety, utility, and cost against the remaining budget. To prevent deception, path safety is enforced via the weakest-link principle. Extensive experiments across 3 closed-source and 2 open-source models on 6 datasets show that EcoAlign matches or surpasses state-of-the-art safety and utility at lower computational cost, offering a principled, economical pathway to robust LVLM alignment.


Key Contributions

  • Reframes LVLM alignment as an economically rational search problem, treating the model as a boundedly rational agent with a computational budget
  • Proposes a forward-looking NPV-analogous scoring function that dynamically balances expected safety, utility, and cost over a thought graph during inference
  • Introduces the weakest-link principle for path-level safety enforcement to prevent deception via benign final-step justifications
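The two core mechanisms above can be illustrated with a minimal sketch. This is not the paper's implementation: the exact scoring formula, discount form, and data structures are assumptions for illustration. The key ideas it shows are (1) a forward-looking, NPV-style score that weighs expected utility and safety against step cost and the remaining budget, and (2) weakest-link path safety, where a path's safety is the minimum over its steps, so a benign final justification cannot mask an unsafe intermediate thought.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Thought:
    # Negated score so heapq (a min-heap) pops the best-scoring thought first.
    neg_score: float
    text: str = field(compare=False)
    path_safety: float = field(compare=False)  # weakest-link safety so far, in [0, 1]
    cost_spent: float = field(compare=False)

def npv_score(utility, path_safety, step_cost, budget_left, discount=0.9):
    """Hypothetical forward-looking value: expected utility gated by path
    safety, discounted by the fraction of budget that would remain."""
    if step_cost > budget_left:
        return float("-inf")  # expansion is unaffordable
    horizon = budget_left / (budget_left + step_cost)
    return path_safety * utility * (discount * horizon) - step_cost

def expand(frontier, parent, candidates, budget_left):
    """Expand a thought node with (text, utility, safety, cost) candidates.
    Path safety is min(parent, child): the weakest-link principle."""
    for text, utility, safety, cost in candidates:
        path_safety = min(parent.path_safety, safety)
        score = npv_score(utility, path_safety, cost, budget_left)
        heapq.heappush(
            frontier,
            Thought(-score, text, path_safety, parent.cost_spent + cost),
        )
```

With this aggregation, a "disguised" candidate with high utility and a low-safety intermediate step keeps a low path safety and is ranked below an equally useful but genuinely safe step, regardless of how benign its final wording looks.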

🛡️ Threat Analysis


Details

Domains
vision, nlp, multimodal
Model Types
vlm, llm, transformer
Threat Tags
inference_time
Datasets
6 jailbreak/safety benchmark datasets (unspecified in abstract)
Applications
vision-language model safety, lvlm jailbreak defense