Licheng Wang

benchmark arXiv Nov 24, 2025 · Nov 2025

Yu Cui, Yifei Liu, Hang Fu et al. · Beijing Institute of Technology · Tsinghua University

Benchmarks existential safety risks in LLMs via prefix completion jailbreaks, including dangerous autonomous tool-calling behavior

Prompt Injection · Excessive Agency · nlp · multimodal

1 citation

defense arXiv Dec 31, 2025 · Dec 2025

Yu Cui, Hang Fu, Sicheng Pan et al. · Beijing Institute of Technology · Tsinghua University

Provably secure consensus sampling algorithm for LLM groups that tolerates Byzantine adversarial models and eliminates unsafe output abstention

Prompt Injection · nlp · generative

Papers in Database (2)