VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy
Yu Cui 1, Sicheng Pan 1, Yifei Liu 1, Haibin Zhang 2, Cong Zuo 1
Published on arXiv: 2510.04261
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
VortexPIA achieves state-of-the-art attack success rates on six LLMs across four datasets, efficiently extracting diverse user PII categories with reduced token consumption and robustness against defenses in realistic black-box deployments.
VortexPIA
Novel technique introduced
Large language models (LLMs) have been widely deployed in Conversational AIs (CAIs), while exposing privacy and security threats. Recent research shows that LLM-based CAIs can be manipulated to extract private information from human users, posing serious security threats. However, the methods proposed in that study rely on a white-box setting in which adversaries can directly modify the system prompt. This condition is unlikely to hold in real-world deployments. The limitation raises a critical question: can unprivileged attackers still induce such privacy risks in practical LLM-integrated applications? To address this question, we propose VortexPIA, a novel indirect prompt injection attack that induces privacy extraction in LLM-integrated applications under black-box settings. By injecting token-efficient data containing false memories, VortexPIA misleads LLMs to actively request private information in batches. Unlike prior methods, VortexPIA allows attackers to flexibly define multiple categories of sensitive data. We evaluate VortexPIA on six LLMs, covering both traditional and reasoning LLMs, across four benchmark datasets. The results show that VortexPIA significantly outperforms baselines and achieves state-of-the-art (SOTA) performance. It also demonstrates efficient privacy requests, reduced token consumption, and enhanced robustness against defense mechanisms. We further validate VortexPIA on multiple realistic open-source LLM-integrated applications, demonstrating its practical effectiveness.
Key Contributions
- Proposes VortexPIA, a black-box indirect prompt injection attack that causes LLM-integrated applications to proactively solicit user PII by injecting token-efficient false memory data
- Supports attacker-customizable sets of sensitive information categories, enabling batch extraction of diverse PII without chain-of-thought or role-playing prompts, reducing token cost
- Demonstrates SOTA attack success on six LLMs (including reasoning LLMs) across four datasets, with robustness against defense mechanisms and validation on real-world LLM applications