Retrieval-Augmented Review Generation for Poisoning Recommender Systems
Shiyi Yang 1,2, Xinshu Li 3, Guanglin Zhou 4, Chen Wang 2,1, Xiwei Xu 2,1, Liming Zhu 2,1, Lina Yao 2,1
Published on arXiv
2508.15252
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
RAGAN outperforms state-of-the-art poisoning attacks by up to 70% in attack performance while generating more natural-sounding fake reviews
Novel technique introduced
RAGAN
Recent studies have shown that recommender systems (RSs) are highly vulnerable to data poisoning attacks, where malicious actors inject fake user profiles, including a group of well-designed fake ratings, to manipulate recommendations. Due to security and privacy constraints in practice, attackers typically possess limited knowledge of the victim system and thus need to craft profiles that transfer across black-box RSs. To maximize the attack impact, the profiles must also remain imperceptible. However, generating such high-quality profiles with restricted resources is challenging. Some works suggest incorporating fake textual reviews to strengthen the profiles; yet the poor quality of the reviews largely undermines attack effectiveness and imperceptibility in the practical setting. To tackle these challenges, in this paper we propose to enhance the quality of the review text by harnessing the in-context learning (ICL) capabilities of multimodal foundation models. To this end, we introduce a demonstration retrieval algorithm and a text style transfer strategy to augment naive ICL. Specifically, we propose a novel practical attack framework named RAGAN to generate high-quality fake user profiles, which can provide insights into the robustness of RSs. The profiles are generated by a jailbreaker and collaboratively optimized by an instructional agent and a guardian to improve attack transferability and imperceptibility. Comprehensive experiments on various real-world datasets demonstrate that RAGAN achieves state-of-the-art poisoning attack performance.
Key Contributions
- RAGAN framework: retrieval-augmented ICL with multimodal foundation models to generate high-quality fake user profiles (ratings + reviews) for poisoning recommender systems
- Demonstration retrieval algorithm and text style transfer strategy to improve realism and imperceptibility of generated fake reviews
- Jailbreaker + instructional agent + guardian collaborative optimization pipeline to improve cross-system attack transferability while evading fraud detection
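The summary does not specify how RAGAN's demonstration retrieval algorithm works. As a rough illustration of the general idea behind retrieval-augmented ICL (picking the examples from a pool that are most similar to a query, then placing them in the prompt), here is a minimal sketch. The bag-of-words cosine similarity, the function names, and the sample texts are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term counts (illustrative stand-in for a real embedding)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve_demonstrations(query, pool, k=2):
    """Return the k pool entries most similar to the query (hypothetical helper)."""
    q = bow(query)
    ranked = sorted(pool, key=lambda d: cosine(q, bow(d)), reverse=True)
    return ranked[:k]

# Toy pool of candidate demonstrations (made-up texts).
pool = [
    "battery life is great and the screen is sharp",
    "the plot was slow but the acting was strong",
    "screen is bright but battery drains fast",
    "comfortable shoes with good arch support",
]
demos = retrieve_demonstrations("how is the battery and screen", pool, k=2)
```

In practice a retrieval step like this would likely use dense embeddings rather than word counts; the sketch only shows the select-top-k structure.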
🛡️ Threat Analysis
Core contribution is a data poisoning attack injecting fake user profiles (ratings + textual reviews) into recommender system training data to manipulate item recommendations — classic training-time data poisoning with a focus on transferability across black-box victim systems and imperceptibility to defenses.
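To make the threat model concrete, the following toy sketch shows why injected profiles matter for robustness evaluation: a handful of fake ratings can change where a target item ranks. The mean-rating ranker, the function names, and the data are assumptions chosen for simplicity; this is not RAGAN, and real recommender systems are learned models rather than simple averages.

```python
from collections import defaultdict

def item_scores(ratings):
    """Average rating per item from (user, item, rating) triples."""
    by_item = defaultdict(list)
    for _user, item, rating in ratings:
        by_item[item].append(rating)
    return {item: sum(v) / len(v) for item, v in by_item.items()}

def rank_of(item, ratings):
    """1-based rank of an item when items are sorted by average rating."""
    scores = item_scores(ratings)
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(item) + 1

# Clean training data: item C is the least liked.
clean = [
    ("u1", "A", 5), ("u2", "A", 4),
    ("u1", "B", 4), ("u3", "B", 3),
    ("u2", "C", 2), ("u3", "C", 1),
]
# Hypothetical injected profiles that all rate the target item C highly.
fake = [(f"f{i}", "C", 5) for i in range(4)]

rank_before = rank_of("C", clean)         # rank on clean data
rank_after = rank_of("C", clean + fake)   # rank after injection
```

Comparing `rank_before` and `rank_after` is the kind of before/after measurement a robustness study would run, typically with metrics such as hit ratio on a trained model instead of a raw rank.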