Dionysis Kalogerias

Papers in Database (1)

attack · arXiv · Apr 28, 2026

Test-Time Safety Alignment

Baturay Saglam, Dionysis Kalogerias · Yale University

Gradient-based embedding optimization that bypasses LLM safety alignment to neutralize refusals on harmful queries

Input Manipulation Attack · Prompt Injection · nlp