Jaejin Lee

attack · arXiv · Aug 2, 2025

Yelim Ahn, Jaejin Lee · Seoul National University

Jailbreaks LLMs by embedding harmful keywords in word search, anagram, and crossword puzzles, achieving an average attack success rate (ASR) of 88.8% across five frontier models

Prompt Injection · NLP

Papers in Database (1)