Scott Thornton

defense arXiv Jan 6, 2026

Scott Thornton · Perfecxion

Defense-in-depth architecture combining DPO, activation steering, and input canonicalization reduces LLM jailbreak success rate by 88%

Prompt Injection nlp

benchmark arXiv Feb 18, 2026

Scott Thornton · Perfecxion.ai

Benchmark study finds adversarial code comments fail to meaningfully fool LLM vulnerability detectors across eight frontier models in 14,012 trials

Prompt Injection nlp

Papers in Database (2)