Scott Thornton

h-index: 3 · 316 citations · 7 papers (total)

Papers in Database (2)

defense arXiv Jan 6, 2026

TRYLOCK: Defense-in-Depth Against LLM Jailbreaks via Layered Preference and Representation Engineering

Scott Thornton · Perfecxion

Defense-in-depth architecture combining DPO, activation steering, and input canonicalization reduces LLM jailbreak success rate by 88%

Prompt Injection nlp
benchmark arXiv Feb 18, 2026

Can Adversarial Code Comments Fool AI Security Reviewers -- Large-Scale Empirical Study of Comment-Based Attacks and Defenses Against LLM Code Analysis

Scott Thornton · Perfecxion.ai

Benchmark study finds adversarial code comments fail to meaningfully fool LLM vulnerability detectors across eight frontier models in 14,012 trials

Prompt Injection nlp