Mingjie Li

Papers in Database (2)

benchmark arXiv Aug 28, 2025

JADES: A Universal Framework for Jailbreak Assessment via Decompositional Scoring

Junjie Chu, Mingjie Li, Ziqing Yang et al. · CISPA Helmholtz Center for Information Security · Xi’an Jiaotong University

Benchmark framework that uses decompositional scoring to evaluate LLM jailbreak success, achieving 98.5% agreement with human evaluators and exposing overestimated attack success

Prompt Injection nlp
defense arXiv Apr 17, 2026

Pruning Unsafe Tickets: A Resource-Efficient Framework for Safer and More Robust LLMs

Wai Man Si, Mingjie Li, Michael Backes et al. · CISPA Helmholtz Center for Information Security

Prunes model parameters responsible for unsafe LLM outputs, reducing harmful generations and jailbreak success with minimal utility loss

Prompt Injection nlp multimodal