SemanticShield: LLM-Powered Audits Expose Shilling Attacks in Recommender Systems
Kaihong Li, Huichi Zhou, Bin Ma, Fangjun Huang
Published on arXiv: 2509.24961
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
SemanticShield (1.5B parameters, GRPO-finetuned) surpasses Llama-3-70B-Instruct in shilling detection accuracy and generalizes to previously unseen attack strategies
Novel technique introduced: SemanticShield
Recommender systems (RS) are widely used in e-commerce for personalized suggestions, yet their openness makes them susceptible to shilling attacks, where adversaries inject fake behaviors to manipulate recommendations. Most existing defenses emphasize user-side behaviors while overlooking item-side features such as titles and descriptions that can expose malicious intent. To address this gap, we propose a two-stage detection framework that integrates item-side semantics via large language models (LLMs). The first stage pre-screens suspicious users using low-cost behavioral criteria, and the second stage employs LLM-based auditing to evaluate semantic consistency. Furthermore, we enhance the auditing model through reinforcement fine-tuning on a lightweight LLM with carefully designed reward functions, yielding a specialized detector called SemanticShield. Experiments on six representative attack strategies demonstrate the effectiveness of SemanticShield against shilling attacks, and further evaluation on previously unseen attack methods shows its strong generalization capability. Code is available at https://github.com/FrankenstLee/SemanticShield.
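The two-stage flow from the abstract can be sketched in a minimal, self-contained form. Everything here is an illustrative stand-in: the paper's first stage uses PCA-based behavioral criteria (not the extreme-rating heuristic below), and its second stage is an actual LLM auditor (SemanticShield), not a keyword overlap check. Names like `behavioral_score` and `llm_audit` are assumptions for this sketch.

```python
# Hypothetical sketch of the two-stage detection pipeline: a cheap behavioral
# pre-screen followed by a semantic-consistency audit of item-side text.
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    user_id: str
    ratings: dict                               # item_id -> rating in [1, 5]
    reviews: dict = field(default_factory=dict) # item_id -> review text

def behavioral_score(profile: UserProfile, max_rating: int = 5) -> float:
    """Stage 1 proxy: fraction of ratings pinned at the extremes.
    Shilling profiles often push or nuke targets with extreme ratings.
    (The paper's pre-screen is PCA-based; this is a toy stand-in.)"""
    if not profile.ratings:
        return 0.0
    extreme = sum(1 for r in profile.ratings.values() if r in (1, max_rating))
    return extreme / len(profile.ratings)

def llm_audit(profile: UserProfile, item_texts: dict) -> bool:
    """Stage 2 stand-in: crude semantic consistency between a review and the
    item's title/description. The paper uses an LLM judgment here."""
    for item_id, review in profile.reviews.items():
        item_words = set(item_texts.get(item_id, "").lower().split())
        review_words = set(review.lower().split())
        if not item_words & review_words:
            return False  # review semantically disconnected from the item
    return True

def detect_shilling(profiles, item_texts, prescreen_threshold=0.9):
    """Pre-screen cheaply, then run the costlier audit only on suspects."""
    suspicious = [p for p in profiles if behavioral_score(p) >= prescreen_threshold]
    return [p.user_id for p in suspicious if not llm_audit(p, item_texts)]
```

The point of the two-stage split is cost: the expensive (LLM) audit only runs on the small subset flagged by the cheap behavioral filter.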
Key Contributions
- Two-stage detection framework combining PCA-based behavioral pre-screening with LLM semantic auditing of item-side features to detect shilling attack profiles
- Reinforcement fine-tuning of Qwen2.5-1.5B-Instruct via GRPO with task-specific reward functions, yielding SemanticShield — a lightweight specialized detector that outperforms Llama-3-70B-Instruct
- Demonstrated generalization to unseen attack strategies across six representative shilling attack methods on real-world datasets
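The GRPO fine-tuning mentioned above centers on group-relative advantages: several completions are sampled per prompt, and each completion's reward is normalized against its group's mean and standard deviation. The reward terms below are assumptions for illustration (a correctness term for the fake/genuine verdict plus a format term); the paper designs its own task-specific rewards, and the `<answer>` tag convention is hypothetical.

```python
# Hedged sketch of a GRPO-style reward and group-relative advantage.
import re
import statistics

def reward(completion: str, gold_label: str) -> float:
    """Illustrative reward: +0.2 if the output is parseable in the assumed
    <answer>...</answer> format, +1.0 if the verdict matches the gold label."""
    r = 0.0
    m = re.search(r"<answer>\s*(fake|genuine)\s*</answer>", completion, re.I)
    if m:
        r += 0.2  # format reward
        if m.group(1).lower() == gold_label:
            r += 1.0  # correctness reward
    return r

def group_advantages(rewards):
    """GRPO normalizes each sampled completion's reward within its group,
    so no separate value network is needed to estimate a baseline."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard uniform-reward groups
    return [(r - mu) / sigma for r in rewards]
```

Because advantages are computed relative to the group, completions that are merely average earn no gradient signal, which is part of what makes GRPO practical for small models like the 1.5B detector here.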
🛡️ Threat Analysis
Shilling attacks inject fake user profiles into recommender system interaction data to manipulate model behavior — a canonical form of data poisoning. SemanticShield is a defense that detects this poisoning by combining statistical pre-filtering with LLM-based semantic consistency auditing.
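To make the threat concrete, here is a sketch of one classic shilling strategy, the "average attack": filler items are rated near their dataset means (to blend in) while the target item is pushed to the maximum rating. This is a generic textbook construction, not one of the paper's six evaluated attack methods specifically, and all parameter names are illustrative.

```python
# Illustrative construction of an average-attack shilling profile.
import random

def average_attack_profile(item_means, target_item, n_filler=10,
                           max_rating=5, rng=None):
    """Build one fake profile: fillers mimic average behavior, target is pushed."""
    rng = rng or random.Random(0)
    candidates = [i for i in item_means if i != target_item]
    fillers = rng.sample(candidates, min(n_filler, len(candidates)))
    profile = {i: round(item_means[i]) for i in fillers}  # look statistically normal
    profile[target_item] = max_rating                     # manipulate the target
    return profile
```

Profiles like this are exactly what the two-stage defense must separate from genuine users: their rating statistics are deliberately camouflaged, which is why item-side semantic signals (titles, descriptions, review text) add detection power beyond behavioral features alone.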