Jason Vega

h-index: 2 · 47 citations · 3 papers (total)

Papers in Database (1)

defense arXiv Dec 5, 2025 · Dec 2025

Matching Ranks Over Probability Yields Truly Deep Safety Alignment

Jason Vega, Gagandeep Singh · University of Illinois Urbana-Champaign

Proposes RAP, an attack that bypasses deep-safety-alignment defenses for LLMs via rank-guided token selection, then counters it with PRESTO, an attention-regularization defense.

Prompt Injection nlp