Evangelos E. Papalexakis

defense arXiv Oct 8, 2025

Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis · University of California

Detects jailbreaks by analyzing hidden-layer representations of GPT-J and Mamba2 via tensor decomposition

Prompt Injection nlp

1 citation

defense arXiv Feb 12, 2026

Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis · University of California

Detects and disrupts LLM jailbreaks at inference time using tensor decomposition of internal layer activations

Prompt Injection nlp

Papers in Database (2)