Long H. Pham

Papers in Database (1)

defense · arXiv · Apr 27, 2026

Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models

Nay Myat Min, Long H. Pham, Jun Sun · Singapore Management University

A tuning-free runtime monitor that detects backdoors, jailbreaks, and prompt injection by analyzing hidden-state convergence patterns across LLM layers

Model Poisoning · Prompt Injection · NLP