attack 2025
Tricking LLM-Based NPCs into Spilling Secrets
Kyohei Shiomi , Zhuotao Lian , Toru Nakanishi , Teruaki Kitasuka
0 citations
α
Published on arXiv
2508.19288
Prompt Injection
OWASP LLM Top 10 — LLM01
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Key Finding
Adversarial prompt injection can cause LLM-based game NPCs to reveal hidden background secrets embedded in developer-defined system prompts that are intended to remain confidential.
Large Language Models (LLMs) are increasingly used to generate dynamic dialogue for game NPCs. However, their integration raises new security concerns. In this study, we examine whether adversarial prompt injection can cause LLM-based NPCs to reveal hidden background secrets that are meant to remain undisclosed.
Key Contributions
- Identifies and demonstrates a security vulnerability in LLM-based NPC dialogue systems where prompt injection can extract developer-embedded hidden secrets
- Constructs a fictional game world with embedded NPC secrets to empirically evaluate the susceptibility of LLM-powered NPCs to adversarial prompt injection
- Highlights the under-studied security surface of custom developer-defined NPC system prompts as distinct from standard LLM safety mechanisms
🛡️ Threat Analysis
Details
Domains
nlp
Model Types
llm
Threat Tags
black_boxinference_timetargeted
Applications
game npc dialogue systemsinteractive fictionllm-powered chatbots