Long H. Pham

h-index: 8 552 citations 26 papers (total)

Papers in Database (1)

attack arXiv Jan 19, 2026 · 11w ago

CORVUS: Red-Teaming Hallucination Detectors via Internal Signal Camouflage in Large Language Models

Nay Myat Min, Long H. Pham, Hongyu Zhang et al. · Singapore Management University · Chongqing University

Attacks LLM hallucination detectors by fine-tuning LoRA adapters to camouflage internal uncertainty, hidden-state, and attention signals

Output Integrity Attack nlp
PDF