Potent but Stealthy: Rethink Profile Pollution against Sequential Recommendation via Bi-level Constrained Reinforcement Paradigm
Jiajie Su 1, Zihan Nan 2, Yunshan Ma 3, Xiaobo Xia 4,5, Xiaohua Feng 1, Weiming Liu 6, Xiang Chen 1, Xiaolin Zheng 1, Chaochao Chen 1
3 Singapore Management University
4 National University of Singapore
Published on arXiv
2511.09392
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
CREAT achieves more effective and stealthy targeted profile pollution attacks on sequential recommenders compared to prior methods by balancing pattern inversion with distributional consistency constraints.
CREAT
Novel technique introduced
Sequential recommenders, which exploit dynamic user intents through interaction sequences, are vulnerable to adversarial attacks. Existing attacks rely primarily on data poisoning, which requires large-scale user access or fake profiles and is therefore impractical. In this paper, we focus on the Profile Pollution Attack (PPA), which subtly contaminates part of a user's interactions to induce targeted mispredictions. Previous PPA methods suffer from two limitations: i) over-reliance on sequence-horizon impact restricts fine-grained perturbations on item transitions, and ii) holistic modifications cause detectable distribution shifts. To address these challenges, we propose CREAT, a constrained reinforcement-driven attack that synergizes a bi-level optimization framework with multi-reward reinforcement learning to balance adversarial efficacy and stealthiness. We first develop a Pattern Balanced Rewarding Policy, which integrates pattern inversion rewards to invert critical patterns with distribution consistency rewards, computed via unbalanced co-optimal transport, to minimize detectable shifts. We then employ a Constrained Group Relative Reinforcement Learning paradigm that enables step-wise perturbations through dynamic barrier constraints and group-shared experience replay, achieving targeted pollution with minimal detectability. Extensive experiments demonstrate the effectiveness of CREAT.
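The abstract's core trade-off (attack efficacy vs. stealth, enforced through a dynamic barrier constraint) can be illustrated with a minimal sketch. The paper does not specify the exact barrier form; this toy assumes a log-barrier that rewards pattern inversion only while the estimated distribution shift stays inside a stealth budget — the function names and the barrier shape are illustrative, not CREAT's actual objective.

```python
import math


def combined_reward(inversion_reward, shift_cost, budget, barrier_weight=1.0):
    """Toy stand-in for a barrier-constrained attack objective.

    inversion_reward: efficacy term (higher = target pattern inverted harder)
    shift_cost:       estimated distributional shift of the polluted sequence
    budget:           stealth budget; the barrier diverges as shift_cost -> budget
    """
    slack = budget - shift_cost
    if slack <= 0:
        # Constraint violated: the perturbation is too detectable to accept.
        return float("-inf")
    # Log-barrier keeps the policy strictly inside the stealth region.
    return inversion_reward + barrier_weight * math.log(slack)
```

A policy trained on such a reward is pushed toward perturbations that are both effective (large `inversion_reward`) and far from the detectability boundary (large `slack`).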
Key Contributions
- CREAT: a bi-level constrained reinforcement learning attack framework that balances adversarial efficacy and stealthiness for profile pollution in sequential recommenders
- Pattern Balanced Rewarding Policy using pattern inversion rewards and distribution consistency rewards via unbalanced co-optimal transport to minimize detectable distributional shifts
- Constrained Group Relative Reinforcement Learning with dynamic barrier constraints and group-shared experience replay for step-wise stealthy perturbations
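"Group relative" reinforcement learning typically baselines each rollout against the other rollouts in its group rather than a learned critic. As a hedged sketch of that idea (the paper's exact estimator is not given here), group-relative advantages can be computed by standardizing rewards within the group:

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of rollouts (GRPO-style baseline).

    Each perturbation episode's advantage is its reward relative to the
    group mean, scaled by the group's standard deviation.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]
```

Rollouts that beat the group average get positive advantages and are reinforced; below-average ones are suppressed, without needing a separate value network.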
🛡️ Threat Analysis
The Profile Pollution Attack targets inference-time inputs: the user interaction sequence fed to the sequential recommender is perturbed with targeted adversarial modifications to induce specific mispredictions. The paper explicitly distinguishes PPA from training-time data poisoning, positioning it as an input manipulation attack (OWASP ML01) that alters the model's input sequence rather than its training data.
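The threat model above can be made concrete with a toy example. The greedy search and co-occurrence scorer below are stand-ins for CREAT's learned RL policy and the victim recommender; they only illustrate how swapping a few items in a profile can promote a target item at inference time.

```python
def pollute_profile(sequence, candidate_items, score_fn, target, budget=2):
    """Greedily swap at most `budget` items in a user's interaction sequence
    so that `score_fn` ranks `target` higher. Toy illustration only.

    score_fn(sequence, item) -> relevance score of `item` given `sequence`.
    """
    polluted = list(sequence)
    for _ in range(budget):
        best_score, best_pos, best_item = score_fn(polluted, target), None, None
        for pos in range(len(polluted)):
            original = polluted[pos]
            for item in candidate_items:
                polluted[pos] = item
                score = score_fn(polluted, target)
                if score > best_score:
                    best_score, best_pos, best_item = score, pos, item
            polluted[pos] = original  # undo trial swap
        if best_pos is None:
            break  # no single swap improves the target's score
        polluted[best_pos] = best_item
    return polluted


# Toy victim scorer: target relevance = co-occurrence strength with the profile.
cooc = {("a", "t"): 1.0, ("b", "t"): 0.8}

def score_fn(seq, item):
    return sum(cooc.get((s, item), 0.0) for s in seq)

polluted = pollute_profile(["x", "y", "z"], ["a", "b"], score_fn, "t", budget=2)
```

Note the polluted profile keeps its original length and changes only a few entries — the stealth property that CREAT additionally enforces through its distribution consistency reward.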