Cong Wang

attack · arXiv · Jan 29, 2026

Xiang Zheng, Yutao Wu, Hanxun Huang et al. · City University of Hong Kong · Deakin University +4 more

Self-evolving agent framework extracts hidden system prompts from 41 commercial LLMs using UCB-guided natural language probing strategies

Sensitive Information Disclosure · Prompt Injection · nlp

Papers in Database (1)