Exposing Citation Vulnerabilities in Generative Engines
Riku Mochizuki (1,2), Shusuke Komatsu (1,3), Souta Noguchi (1,2), Kazuto Ataka (2)
Published on arXiv (arXiv:2510.06823)
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
Generative engine answers to U.S. political prompts cite official party websites (sources with a high content-injection barrier) only 25–45% of the time, versus 60–65% in Japan, indicating substantially higher susceptibility to web-poisoning attacks in the U.S. context.
Content-Injection Barrier Evaluation
Novel technique introduced
We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as the difficulty an attacker faces in manipulating answers to user prompts by placing malicious content on the web. GEs combine two functions: web search and large-language-model answer generation that cites web pages. Because anyone can publish information on the web, GEs are vulnerable to poisoning attacks. Existing studies of citation evaluation focus on how faithfully answer content reflects cited sources, leaving unexamined which web sources should be selected as citations to defend against poisoning attacks. To fill this gap, we introduce evaluation criteria that assess poisoning threats using the citation information contained in answers. Our criteria classify the publisher attributes of citations to estimate the content-injection barrier, thereby revealing the threat of poisoning attacks in current GEs. We conduct experiments in political domains in Japan and the United States (U.S.) using our criteria and show that citations from official party websites (primary sources) account for approximately 25–45% of citations in the U.S. and 60–65% in Japan, indicating that U.S. political answers are at higher risk of poisoning attacks. We also find that sources with low content-injection barriers are frequently cited yet poorly reflected in answer content. To mitigate this threat, we discuss how publishers of primary sources can increase the exposure of their web content in answers and show that well-known techniques are limited by language differences.
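The publisher-attribute classification the abstract describes can be sketched as follows. The domain lists, barrier labels, and function names here are illustrative assumptions for exposition, not the paper's actual taxonomy or implementation: the idea is simply to map each cited URL to a publisher attribute, assign it a content-injection barrier level, and report the share of high-barrier (primary-source) citations in an answer.

```python
from urllib.parse import urlparse

# Illustrative mapping from publisher attribute to content-injection barrier.
# The paper's real categories and domain lists differ; these are assumptions.
HIGH_BARRIER_DOMAINS = {"gop.com", "democrats.org", "jimin.jp", "cdp-japan.jp"}  # official party sites
LOW_BARRIER_DOMAINS = {"reddit.com", "medium.com", "quora.com"}  # open publishing platforms

def barrier_level(url: str) -> str:
    """Classify a cited URL by how hard it is for an attacker to inject content there."""
    host = urlparse(url).netloc.lower().removeprefix("www.")
    if host in HIGH_BARRIER_DOMAINS:
        return "high"    # only the publisher itself can post content
    if host in LOW_BARRIER_DOMAINS:
        return "low"     # anyone can post content
    return "unknown"     # would need manual or heuristic classification

def primary_source_share(citations: list[str]) -> float:
    """Fraction of an answer's citations whose publisher has a high barrier."""
    if not citations:
        return 0.0
    levels = [barrier_level(u) for u in citations]
    return levels.count("high") / len(levels)

answer_citations = [
    "https://www.gop.com/platform",
    "https://www.reddit.com/r/politics/comments/abc",
    "https://example-news.com/article",
    "https://democrats.org/where-we-stand",
]
print(primary_source_share(answer_citations))  # 0.5
```

Aggregating this share over many answers per country is what yields figures like the 25–45% (U.S.) versus 60–65% (Japan) primary-source rates reported above; a lower share means more of the answer's evidence base sits behind a low injection barrier.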
Key Contributions
- Defines 'content-injection barrier' — a measure of how easily an attacker can place malicious content that gets cited by generative engines — and introduces publisher-attribute-based evaluation criteria to quantify it
- Empirically shows US political GE answers cite official (high-barrier) sources only 25–45% of the time vs. 60–65% in Japan, indicating higher poisoning risk for English political queries
- Identifies that low-barrier sources are frequently cited yet poorly reflected in answer content, and evaluates the limited effectiveness of known mitigation strategies across language contexts