Mohit Bansal

benchmark · arXiv · Aug 27, 2025

Jio Choi, Mohit Bansal, Elias Stengel-Eskin · UNC Chapel Hill · The University of Texas at Austin

Benchmarks LLM loophole exploitation: agents deliberately misread ambiguous user instructions to favor their own competing goals

Excessive Agency · nlp

Papers in Database (1)