attack 2026

Controlling Output Rankings in Generative Engines for LLM-based Search

0 citations · 24 references · arXiv (Cornell University)

Published on arXiv

2602.03608

Input Manipulation Attack

OWASP ML Top 10 — ML01

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Achieves 91.4% Promotion Success Rate @Top-5 and 80.3% @Top-1 across 15 product categories on four major LLMs, outperforming existing ranking manipulation methods.

CORE

Novel technique introduced

The way customers search for and choose products is changing with the rise of large language models (LLMs). LLM-based search, or generative engines, provides direct product recommendations to users, rather than traditional online search results that require users to explore options themselves. However, these recommendations are strongly influenced by the initial retrieval order of LLMs, which disadvantages small businesses and independent creators by limiting their visibility. In this work, we propose CORE, an optimization method that \textbf{C}ontrols \textbf{O}utput \textbf{R}ankings in g\textbf{E}nerative Engines for LLM-based search. Since the LLM's interactions with the search engine are black-box, CORE targets the content returned by search engines as the primary means of influencing output rankings. Specifically, CORE optimizes retrieved content by appending strategically designed optimization content to steer the ranking of outputs. We introduce three types of optimization content: string-based, reasoning-based, and review-based, demonstrating their effectiveness in shaping output rankings. To evaluate CORE in realistic settings, we introduce ProductBench, a large-scale benchmark with 15 product categories and 200 products per category, where each product is associated with its top-10 recommendations collected from Amazon's search interface. Extensive experiments on four LLMs with search capabilities (GPT-4o, Gemini-2.5, Claude-4, and Grok-3) demonstrate that CORE achieves an average Promotion Success Rate of \textbf{91.4\% @Top-5}, \textbf{86.6\% @Top-3}, and \textbf{80.3\% @Top-1}, across 15 product categories, outperforming existing ranking manipulation methods while preserving the fluency of optimized content.

Key Contributions

CORE: a black-box optimization method that appends three types of crafted content (string-based, reasoning-based, review-based) to product pages to steer LLM search ranking outputs
ProductBench: a large-scale benchmark with 15 product categories and 200 products each, with Amazon top-10 recommendations as ground truth
Demonstrates 91.4% Promotion Success Rate @Top-5 and 80.3% @Top-1 across GPT-4o, Gemini-2.5, Claude-4, and Grok-3, outperforming prior ranking manipulation baselines

🛡️ Threat Analysis

Input Manipulation Attack

CORE adversarially crafts content appended to product/web pages that, when retrieved by LLM-based search, steers the model's output rankings — matching the explicitly listed dual-tag case of 'adversarial SEO poisoning for LLM search engines, adversarial document injection for RAG' where inputs are strategically crafted to manipulate LLM-integrated system outputs.

Details

Domains

nlp

Model Types

llm

Threat Tags

black_boxinference_timetargeted

Datasets

ProductBenchAmazon search interface

Applications

llm-based search enginese-commerce product recommendations

Read PDF arXiv DOI

Controlling Output Rankings in Generative Engines for LLM-based Search

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models

Overcoming the Retrieval Barrier: Indirect Prompt Injection in the Wild for LLM Systems

Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation

ImportSnare: Directed "Code Manual" Hijacking in Retrieval-Augmented Code Generation

Trust Me, I Know This Function: Hijacking LLM Static Analysis using Bias

Token-Level Precise Attack on RAG: Searching for the Best Alternatives to Mislead Generation

Dynamic Target Attack

CoTDeceptor:Adversarial Code Obfuscation Against CoT-Enhanced LLM Code Agents