Omer Hofman

Papers in Database (1)

benchmark arXiv Mar 15, 2026

When Scanners Lie: Evaluator Instability in LLM Red-Teaming

Lidor Erez, Omer Hofman, Tamir Nizri et al.

Automated LLM red-teaming scanners produce unstable vulnerability measurements because their evaluators are unreliable, with attack success rate (ASR) varying by up to 33%.

Prompt Injection nlp