Aengus Lynch

Papers in Database (1)

survey · arXiv · Mar 31, 2026

The Persistent Vulnerability of Aligned AI Systems

Aengus Lynch · University College London

Comprehensive AI safety thesis spanning mechanistic interpretability, sleeper agent defenses, jailbreaking frontier models, and autonomous agent misalignment

Input Manipulation Attack · Prompt Injection · Excessive Agency · nlp · vision · audio · multimodal