Kai Williams

h-index: 2 · 101 citations · 4 papers (total)

Papers in Database (1)

defense · arXiv · Nov 29, 2025

Password-Activated Shutdown Protocols for Misaligned Frontier Agents

Kai Williams, Rohan Subramani, Francis Rhys Ward · MATS

Proposes password-activated shutdown protocols for emergency-stopping misaligned frontier agents and tests them against red-team bypass strategies.

Excessive Agency nlp