Rom Himelstein

h-index: 2 · 13 citations · 6 papers (total)

Papers in Database (2)

benchmark arXiv Nov 5, 2025

Rom Himelstein, Amit LeVi, Brit Youngmann et al. · Technion - Israel Institute of Technology

Benchmark that uses activation steering to bypass refusals, revealing hidden LLM biases masked by safety alignment

Prompt Injection nlp

2 citations

defense arXiv Feb 1, 2026

Eliron Rahimi, Elad Hirshel, Rom Himelstein et al. · Technion - Israel Institute of Technology · Ben-Gurion University of the Negev +1 more

Defends AR and diffusion LLMs against jailbreaks via an SRI signal that detects incomplete internal recovery, with 100× lower overhead

Prompt Injection nlp