Cong Wang

h-index: 4 79 citations 10 papers (total)

Papers in Database (1)

attack arXiv Jan 29, 2026 · 9w ago

Just Ask: Curious Code Agents Reveal System Prompts in Frontier LLMs

Xiang Zheng, Yutao Wu, Hanxun Huang et al. · City University of Hong Kong · Deakin University +4 more

Self-evolving agent framework extracts hidden system prompts from 41 commercial LLMs using UCB-guided natural language probing strategies

Sensitive Information Disclosure Prompt Injection nlp
PDF