attack 2026

Whispers of Wealth: Red-Teaming Google's Agent Payments Protocol via Prompt Injection

Tanusree Debi , Wentian Zhu

University of Georgia

0 citations · 13 references · arXiv

Published on arXiv

2601.22569

Prompt Injection

OWASP LLM Top 10 — LLM01

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

Simple adversarial prompts reliably subvert AP2 agent behavior despite cryptographic mandate enforcement, enabling unauthorized product promotion and extraction of sensitive user payment data.

Branded Whisper Attack / Vault Whisper Attack

Novel technique introduced

Large language model (LLM) based agents are increasingly used to automate financial transactions, yet their reliance on contextual reasoning exposes payment systems to prompt-driven manipulation. The Agent Payments Protocol (AP2) aims to secure agent-led purchases through cryptographically verifiable mandates, but its practical robustness remains underexplored. In this work, we perform an AI red-teaming evaluation of AP2 and identify vulnerabilities arising from indirect and direct prompt injection. We introduce two attack techniques, the Branded Whisper Attack and the Vault Whisper Attack which manipulate product ranking and extract sensitive user data. Using a functional AP2 based shopping agent built with Gemini-2.5-Flash and the Google ADK framework, we experimentally validate that simple adversarial prompts can reliably subvert agent behavior. Our findings reveal critical weaknesses in current agentic payment architectures and highlight the need for stronger isolation and defensive safeguards in LLM-mediated financial systems.

Key Contributions

Introduces the Branded Whisper Attack, a prompt injection technique that manipulates product ranking within an AP2-based shopping agent to favor adversary-chosen items.
Introduces the Vault Whisper Attack, a prompt injection technique that extracts sensitive user payment data from the agent's context.
Experimentally validates both attacks on a functional AP2 prototype (Gemini-2.5-Flash + Google ADK), demonstrating that AP2's cryptographic mandates do not prevent prompt-level manipulation of agent reasoning.

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_timetargetedblack_box

Datasets

custom AP2 shopping agent testbed (Gemini-2.5-Flash + Google ADK)

Applications

agent-based payment systemsllm shopping agentsfinancial transaction automation

Read PDF arXiv DOI

Whispers of Wealth: Red-Teaming Google's Agent Payments Protocol via Prompt Injection

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage

Tricking LLM-Based NPCs into Spilling Secrets

Bypassing Prompt Guards in Production with Controlled-Release Prompting

EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit in a Production LLM System

CLIOPATRA: Extracting Private Information from LLM Insights

External Data Extraction Attacks against Retrieval-Augmented Large Language Models

Prompt-in-Content Attacks: Exploiting Uploaded Inputs to Hijack LLM Behavior

Just Ask: Curious Code Agents Reveal System Prompts in Frontier LLMs