Kamilė Lukošiūtė

attack · arXiv · Oct 7, 2025

Raffaele Mura, Giorgio Piras, Kamilė Lukošiūtė et al. · University of Cagliari · Centre for AI Governance +1 more

White-box LLM jailbreak using latent-space-guided word substitutions to produce low-perplexity prompts that evade perplexity-based safety filters

Prompt Injection · nlp

1 citation · PDF

Papers in Database (1)