Neil Zhenqiang Gong

defense arXiv Oct 14, 2025 · Oct 2025

PromptLocate: Localizing Prompt Injection Attacks

Yuqi Jia, Yupei Liu, Zedian Shao et al. · Duke University · The Pennsylvania State University

First prompt injection localization method for LLMs, pinpointing injected instructions and data for post-attack forensics

Prompt Injection nlp

8 citations 1 influentialPDF

benchmark arXiv Oct 1, 2025 · Oct 2025

WAInjectBench: Benchmarking Prompt Injection Detections for Web Agents

Yinuo Liu, Ruohan Xu, Xilong Wang et al. · Duke University

Benchmarks prompt injection detection methods for web agents, exposing failures against instruction-free and imperceptible image attacks

Input Manipulation Attack Prompt Injection nlpvisionmultimodal

4 citations 1 influentialPDF Code

defense arXiv Sep 29, 2025 · Sep 2025

SecInfer: Preventing Prompt Injection via Inference-time Scaling

Yupei Liu, Yanting Wang, Yuqi Jia et al. · Penn State University · Duke University

Defends LLMs against prompt injection via multi-path sampling and task-guided aggregation at inference time

Prompt Injection nlp

3 citations 1 influentialPDF

attack arXiv Dec 10, 2025 · Dec 2025

ObliInjection: Order-Oblivious Prompt Injection Attack to LLM Agents with Multi-source Data

Reachal Wang, Yuqi Jia, Neil Zhenqiang Gong · Duke University

Gradient-optimized prompt injection attack on multi-source LLM agents that succeeds regardless of segment ordering in the input

Input Manipulation Attack Prompt Injection nlp

2 citations PDF Code

defense arXiv Feb 14, 2026 · 7w ago

AlignSentinel: Alignment-Aware Detection of Prompt Injection Attacks

Yuqi Jia, Ruiqi Wang, Xilong Wang et al. · Duke University · NVIDIA

Three-class attention-based classifier detects prompt injection by distinguishing misaligned, aligned, and non-instruction LLM inputs

Prompt Injection nlp

PDF

benchmark arXiv Feb 12, 2026 · 7w ago

MalTool: Malicious Tool Attacks on LLM Agents

Yuepeng Hu, Yuqi Jia, Mengyuan Li et al. · Duke University · UC Berkeley

Benchmarks malicious tool code attacks on LLM agents; coding LLMs generate evasive malware that defeats VirusTotal and agent-specific detectors

AI Supply Chain Attacks Insecure Plugin Design nlp

PDF

defense arXiv Feb 3, 2026 · 8w ago

WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents

Xilong Wang, Yinuo Liu, Zhun Wang et al. · Duke University · UC Berkeley

Defends LLM web agents against indirect prompt injection by detecting and localizing malicious webpage segments

Prompt Injection nlp

PDF Code

defense arXiv Oct 15, 2025 · Oct 2025

PIShield: Detecting Prompt Injection Attacks via Intrinsic LLM Features

Wei Zou, Yupei Liu, Yanting Wang et al. · Pennsylvania State University · Duke University

Detects prompt injection in LLM applications using residual-stream representations and a lightweight linear classifier

Prompt Injection nlp

PDF

Papers in Database (8)