Cody Rushing

h-index: 1 · 16 citations · 4 papers (total)

Papers in Database (2)

defense arXiv Dec 1, 2025 · Dec 2025

Aaron Sandoval, Cody Rushing · Redwood Research

Factored cognition control protocol isolates untrusted LLM subtask outputs, boosting backdoor detection from 41% to 63%.

Excessive Agency nlp

1 citation

benchmark arXiv Dec 17, 2025 · Dec 2025

Adam Kaufman, James Lucassen, Tyler Tracy et al. · Redwood Research

Benchmark of 637 Linux sysadmin tasks with four sabotage objectives to evaluate AI control protocols for highly privileged LLM agents.

Excessive Agency nlp

1 citation