
HPE: Hallucinated Positive Entanglement for Backdoor Attacks in Federated Self-Supervised Learning

Jiayao Wang 1, Yang Song 1, Zhendong Zhao 2, Jiale Zhang 1, Qilin Wu 3, Wenliang Yuan 4, Junwu Zhu 1, Dongfang Zhao 5

0 citations · 37 references · arXiv (Cornell University)

Published on arXiv

2602.02147

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

HPE significantly outperforms existing FSSL backdoor attacks in attack success rate and demonstrates robustness against multiple defense mechanisms across several FSSL scenarios.

HPE (Hallucinated Positive Entanglement)

Novel technique introduced


Federated self-supervised learning (FSSL) enables collaborative training of self-supervised representation models without sharing raw unlabeled data. While it serves as a crucial paradigm for privacy-preserving learning, it remains vulnerable to backdoor attacks, in which malicious clients manipulate local training to inject targeted backdoors. Existing FSSL attack methods, however, often suffer from low utilization of poisoned samples, limited transferability, and weak persistence. To address these limitations, we propose a new backdoor attack method for FSSL, namely Hallucinated Positive Entanglement (HPE). HPE first employs hallucination-based augmentation using synthetic positive samples to enhance the encoder's embedding of backdoor features. It then introduces feature entanglement to enforce tight binding between triggers and backdoor samples in the representation space. Finally, selective parameter poisoning and proximity-aware updates constrain the poisoned model within the vicinity of the global model, enhancing its stability and persistence. Experimental results on several FSSL scenarios and datasets show that HPE significantly outperforms existing backdoor attack methods in attack success rate and exhibits strong robustness under various defense mechanisms.
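The feature-entanglement step described above pulls triggered samples and their synthetic ("hallucinated") positive counterparts together in the representation space. A minimal sketch of such an alignment objective, using a cosine-similarity loss (the function names and the exact loss form are illustrative, not the paper's implementation):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two embedding batches."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def entanglement_loss(trigger_emb: np.ndarray,
                      hallucinated_emb: np.ndarray) -> float:
    """Alignment loss between embeddings of trigger-stamped samples
    and their synthetic positive pairs. Zero when every pair is
    perfectly aligned; 2 when every pair points in opposite directions."""
    return float(np.mean(1.0 - cosine_sim(trigger_emb, hallucinated_emb)))

# Toy check: identical embeddings give (near-)zero loss.
z = np.random.default_rng(0).normal(size=(4, 8))
print(entanglement_loss(z, z))   # ~0.0
print(entanglement_loss(z, -z))  # 2.0
```

Minimizing this term during the attacker's local SSL training would bind the trigger pattern to a consistent region of the embedding space, which is the stated goal of the entanglement component.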


Key Contributions

  • Hallucination-based augmentation using synthetic positive samples to enhance encoder embedding of backdoor trigger features in FSSL
  • Feature entanglement technique that enforces tight binding between triggers and backdoor samples in the representation space
  • Selective parameter poisoning with proximity-aware updates to keep poisoned models near the global model, improving backdoor persistence across aggregation rounds
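The proximity-aware update in the last contribution keeps the poisoned model close to the global model so the malicious update survives aggregation and norm-based defenses. One common way to realize such a constraint is an L2-ball projection; the sketch below is an assumption about the mechanism, not the paper's exact update rule:

```python
import numpy as np

def proximity_project(w_poisoned: np.ndarray,
                      w_global: np.ndarray,
                      rho: float) -> np.ndarray:
    """Project poisoned weights onto an L2 ball of radius rho around
    the global model, bounding how far the malicious update can drift."""
    delta = w_poisoned - w_global
    norm = np.linalg.norm(delta)
    if norm <= rho:
        return w_poisoned  # already within the allowed vicinity
    return w_global + delta * (rho / norm)  # rescale onto the ball

# Example: an update of norm 5 is shrunk to norm 2.
w_g = np.zeros(3)
w_p = np.array([3.0, 0.0, 4.0])
w_proj = proximity_project(w_p, w_g, rho=2.0)
print(np.linalg.norm(w_proj - w_g))  # 2.0
```

Selective parameter poisoning would then restrict which coordinates of `delta` are modified at all; the projection only controls the overall magnitude.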

🛡️ Threat Analysis

Model Poisoning

HPE is explicitly a backdoor/trojan attack: it embeds hidden behavior into federated SSL encoders that activates only on trigger-stamped inputs, while the model behaves normally otherwise. The paper proposes novel techniques (hallucination-based augmentation, feature entanglement, proximity-aware updates) to improve backdoor persistence and stealthiness across FL aggregation rounds.
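To make the "activates only on backdoored inputs" behavior concrete, here is a minimal sketch of how a classic pixel-pattern trigger is stamped onto image inputs. This is a generic illustration of backdoor triggering, assuming a simple corner patch; the paper's actual trigger design may differ:

```python
import numpy as np

def stamp_trigger(images: np.ndarray,
                  patch_value: float = 1.0,
                  size: int = 3) -> np.ndarray:
    """Stamp a small square patch into the bottom-right corner of each
    image in a (N, H, W) batch. Clean images are left untouched; only
    the returned copy carries the trigger."""
    out = images.copy()
    out[:, -size:, -size:] = patch_value
    return out

# A clean batch vs. its triggered copy.
clean = np.zeros((2, 8, 8))
triggered = stamp_trigger(clean)
print(triggered[0, -1, -1], clean[0, -1, -1])  # 1.0 0.0
```

At inference time, the poisoned encoder maps such trigger-stamped inputs to the attacker-chosen region of the representation space while mapping clean inputs normally.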


Details

Domains
vision, federated-learning
Model Types
federated, transformer
Threat Tags
training_time, targeted, grey_box
Applications
federated self-supervised learning, representation learning