Exposing Citation Vulnerabilities in Generative Engines
Riku Mochizuki (1,2), Shusuke Komatsu (1,3), Souta Noguchi (1,2), Kazuto Ataka (2)
Published on arXiv (arXiv:2510.06823)
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
Generative engine answers to U.S. political prompts cite official party websites (sources with a high content-injection barrier) only 25–45% of the time, versus 60–65% in Japan, indicating substantially higher susceptibility to web-poisoning attacks in the U.S. context.
Content-Injection Barrier Evaluation
Novel technique introduced
We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as the difficulty an attacker faces in manipulating answers to user prompts by placing malicious content on the web. GEs combine two functions: web search and large-language-model answer generation that cites web pages. Because anyone can publish information on the web, GEs are vulnerable to poisoning attacks. Existing studies of citation evaluation focus on how faithfully answer content reflects cited sources, leaving unexamined which web sources should be selected as citations to defend against poisoning attacks. To fill this gap, we introduce evaluation criteria that assess poisoning threats using the citation information contained in answers. Our criteria classify the publisher attributes of citations to estimate the content-injection barrier, thereby revealing the threat of poisoning attacks in current GEs. We conduct experiments in political domains in Japan and the United States (U.S.) using our criteria and show that citations from official party websites (primary sources) account for approximately 25–45% of citations in the U.S. and 60–65% in Japan, indicating that U.S. political answers are at higher risk of poisoning attacks. We also find that sources with low content-injection barriers are frequently cited yet poorly reflected in answer content. To mitigate this threat, we discuss how publishers of primary sources can increase the exposure of their web content in answers and show that well-known techniques are limited by language differences.
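The publisher-attribute classification the abstract describes can be sketched as follows. The domain lists, barrier labels, and function names here are illustrative assumptions for exposition, not the paper's actual taxonomy or implementation: the idea is simply to map each cited URL to a publisher attribute, assign it a content-injection barrier level, and report the share of high-barrier (primary-source) citations in an answer.

```python
from urllib.parse import urlparse

# Illustrative mapping from publisher attribute to content-injection barrier.
# The paper's real categories and domain lists differ; these are assumptions.
HIGH_BARRIER_DOMAINS = {"gop.com", "democrats.org", "jimin.jp", "cdp-japan.jp"}  # official party sites
LOW_BARRIER_DOMAINS = {"reddit.com", "medium.com", "quora.com"}  # open publishing platforms

def barrier_level(url: str) -> str:
    """Classify a cited URL by how hard it is for an attacker to inject content there."""
    host = urlparse(url).netloc.lower().removeprefix("www.")
    if host in HIGH_BARRIER_DOMAINS:
        return "high"    # only the publisher itself can post content
    if host in LOW_BARRIER_DOMAINS:
        return "low"     # anyone can post content
    return "unknown"     # would need manual or heuristic classification

def primary_source_share(citations: list[str]) -> float:
    """Fraction of an answer's citations whose publisher has a high barrier."""
    if not citations:
        return 0.0
    levels = [barrier_level(u) for u in citations]
    return levels.count("high") / len(levels)

answer_citations = [
    "https://www.gop.com/platform",
    "https://www.reddit.com/r/politics/comments/abc",
    "https://example-news.com/article",
    "https://democrats.org/where-we-stand",
]
print(primary_source_share(answer_citations))  # 0.5
```

Aggregating this share over many answers per country is what yields figures like the 25–45% (U.S.) versus 60–65% (Japan) primary-source rates reported above; a lower share means more of the answer's evidence base sits behind a low injection barrier.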
Key Contributions
- Defines 'content-injection barrier' — a measure of how easily an attacker can place malicious content that gets cited by generative engines — and introduces publisher-attribute-based evaluation criteria to quantify it
- Empirically shows US political GE answers cite official (high-barrier) sources only 25–45% of the time vs. 60–65% in Japan, indicating higher poisoning risk for English political queries
- Identifies that low-barrier sources are frequently cited yet poorly reflected in answer content, and evaluates the limited effectiveness of known mitigation strategies across language contexts