Yanting Wang

defense arXiv Nov 13, 2025 · Nov 2025

Runpeng Geng, Yanting Wang, Chenlong Yin et al. · The Pennsylvania State University

Defends long-context LLMs against prompt injection by sanitizing high-attention tokens that drive injected instruction-following behavior

Prompt Injection nlp

3 citations 1 influentialPDF Code

defense arXiv Sep 29, 2025 · Sep 2025

Yupei Liu, Yanting Wang, Yuqi Jia et al. · Penn State University · Duke University

Defends LLMs against prompt injection via multi-path sampling and task-guided aggregation at inference time

Prompt Injection nlp

3 citations 1 influentialPDF

attack arXiv Nov 23, 2025 · Nov 2025

Yanting Wang, Runpeng Geng, Jinghui Chen et al. · Pennsylvania State University

Combines gradient-based suffix optimization with semantic template optimization to jailbreak LLMs more effectively than either alone

Input Manipulation Attack Prompt Injection nlp

defense arXiv Jan 15, 2026 · 11w ago

Hao Wang, Yanting Wang, Hao Li et al. · Beihang University · Peking University +1 more

Defends LLMs against jailbreaks via self-play RL where one model concurrently generates and resists adversarial prompts

Prompt Injection nlp

defense arXiv Oct 15, 2025 · Oct 2025

Wei Zou, Yupei Liu, Yanting Wang et al. · Pennsylvania State University · Duke University

Detects prompt injection in LLM applications using residual-stream representations and a lightweight linear classifier

Prompt Injection nlp

benchmark arXiv Dec 11, 2025 · Dec 2025

Jian-Yu Jiang-Lin, Kang-Yang Huang, Ling Zou et al. · National Taiwan University · National Yang Ming Chiao Tung University +1 more

Benchmark for evaluating MLLMs on interpretable deepfake detection across perception, detection, and hallucination dimensions

Output Integrity Attack visionaudiomultimodalnlp

Papers in Database (6)