Fabien Roger

h-index: 4 · 262 citations · 7 papers (total)

Papers in Database (1)

benchmark arXiv Oct 10, 2025 · Oct 2025

All Code, No Thought: Current Language Models Struggle to Reason in Ciphered Language

Shiyuan Guo, Henry Sleight, Fabien Roger · Anthropic · Constellation

Benchmarks LLMs' ability to reason in ciphered text across 28 ciphers, finding that current models cannot reliably evade CoT safety monitoring this way.

Prompt Injection · Excessive Agency · nlp
2 citations · 1 influential · PDF · Code