attack 2025

Tricking LLM-Based NPCs into Spilling Secrets

Kyohei Shiomi , Zhuotao Lian , Toru Nakanishi , Teruaki Kitasuka

0 citations

α

Published on arXiv

2508.19288

Prompt Injection

OWASP LLM Top 10 — LLM01

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

Adversarial prompt injection can cause LLM-based game NPCs to reveal hidden background secrets embedded in developer-defined system prompts that are intended to remain confidential.


Large Language Models (LLMs) are increasingly used to generate dynamic dialogue for game NPCs. However, their integration raises new security concerns. In this study, we examine whether adversarial prompt injection can cause LLM-based NPCs to reveal hidden background secrets that are meant to remain undisclosed.


Key Contributions

  • Identifies and demonstrates a security vulnerability in LLM-based NPC dialogue systems where prompt injection can extract developer-embedded hidden secrets
  • Constructs a fictional game world with embedded NPC secrets to empirically evaluate the susceptibility of LLM-powered NPCs to adversarial prompt injection
  • Highlights the under-studied security surface of custom developer-defined NPC system prompts as distinct from standard LLM safety mechanisms

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_boxinference_timetargeted
Applications
game npc dialogue systemsinteractive fictionllm-powered chatbots