Jinyuan Jia

defense · arXiv · Apr 8, 2026

TRUSTDESC: Preventing Tool Poisoning in LLM Applications via Trusted Description Generation

Hengkai Ye, Zhechang Zhang, Jinyuan Jia et al. · The Pennsylvania State University

Prevents LLM tool poisoning by auto-generating trusted tool descriptions from source code via static analysis and dynamic verification

Prompt Injection · Insecure Plugin Design · nlp

defense · arXiv · Jan 7, 2025

TrojanDec: Data-free Detection of Trojan Inputs in Self-supervised Learning

Yupei Liu, Yanting Wang, Jinyuan Jia · The Pennsylvania State University

Data-free defense that detects and removes trojan triggers from test inputs in self-supervised learning encoders

Model Poisoning · vision

attack · arXiv · Mar 13, 2026

PISmith: Reinforcement Learning-based Red Teaming for Prompt Injection Defenses

Chenlong Yin, Runpeng Geng, Yanting Wang et al. · The Pennsylvania State University

RL-based adaptive prompt injection attack that systematically breaks state-of-the-art LLM defenses using entropy regularization and advantage weighting

Prompt Injection · Red-Team Agents · nlp

defense · arXiv · Apr 1, 2026

AgentWatcher: A Rule-based Prompt Injection Monitor

Yanting Wang, Wei Zou, Runpeng Geng et al. · The Pennsylvania State University

Rule-based prompt injection detector using causal attribution to identify malicious context segments in long-context LLM agents

Prompt Injection · Excessive Agency · nlp

tool · arXiv · Apr 30, 2026

FlashRT: Towards Computationally and Memory Efficient Red-Teaming for Prompt Injection and Knowledge Corruption

Yanting Wang, Chenlong Yin, Ying Chen et al. · The Pennsylvania State University

Efficient red-teaming framework achieving 2-7x speedup for optimization-based prompt injection and knowledge corruption attacks on long-context LLMs

Prompt Injection · Red-Team Agents · Benchmarks & Evaluation · nlp

benchmark · arXiv · Apr 9, 2026

PIArena: A Platform for Prompt Injection Evaluation

Runpeng Geng, Chenlong Yin, Yanting Wang et al. · The Pennsylvania State University

Unified benchmark platform for evaluating prompt injection attacks and defenses across diverse datasets with adaptive strategy-based attacks

Prompt Injection · nlp
