ML Security Papers

attack 2025

Tricking LLM-Based NPCs into Spilling Secrets

Kyohei Shiomi , Zhuotao Lian , Toru Nakanishi , Teruaki Kitasuka

Hiroshima University

0 citations

α

Published on arXiv

2508.19288

Prompt Injection

OWASP LLM Top 10 — LLM01

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

Adversarial prompt injection can cause LLM-based game NPCs to reveal hidden background secrets embedded in developer-defined system prompts that are intended to remain confidential.

Large Language Models (LLMs) are increasingly used to generate dynamic dialogue for game NPCs. However, their integration raises new security concerns. In this study, we examine whether adversarial prompt injection can cause LLM-based NPCs to reveal hidden background secrets that are meant to remain undisclosed.

Key Contributions

Identifies and demonstrates a security vulnerability in LLM-based NPC dialogue systems where prompt injection can extract developer-embedded hidden secrets
Constructs a fictional game world with embedded NPC secrets to empirically evaluate the susceptibility of LLM-powered NPCs to adversarial prompt injection
Highlights the under-studied security surface of custom developer-defined NPC system prompts as distinct from standard LLM safety mechanisms

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

black_boxinference_timetargeted

Applications

game npc dialogue systemsinteractive fictionllm-powered chatbots

Similar Papers

Bypassing Prompt Guards in Production with Controlled-Release Prompting

Whispers of Wealth: Red-Teaming Google's Agent Payments Protocol via Prompt Injection

OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage

EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit in a Production LLM System

CLIOPATRA: Extracting Private Information from LLM Insights

External Data Extraction Attacks against Retrieval-Augmented Large Language Models

Prompt-in-Content Attacks: Exploiting Uploaded Inputs to Hijack LLM Behavior

Just Ask: Curious Code Agents Reveal System Prompts in Frontier LLMs