Cengiz Pehlevan

Papers in Database (1)

benchmark arXiv Mar 11, 2026

Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover

Indranil Halder, Annesya Banerjee, Cengiz Pehlevan · Harvard University · Massachusetts Institute of Technology

Derives a polynomial-to-exponential scaling law for jailbreak success under adversarial prompt injection, using spin-glass theory.

Prompt Injection nlp