HPE: Hallucinated Positive Entanglement for Backdoor Attacks in Federated Self-Supervised Learning
Jiayao Wang 1, Yang Song 1, Zhendong Zhao 2, Jiale Zhang 1, Qilin Wu 3, Wenliang Yuan 4, Junwu Zhu 1, Dongfang Zhao 5
Published on arXiv
arXiv:2602.02147
Model Poisoning
OWASP ML Top 10 — ML10
Key Finding
HPE significantly outperforms existing FSSL backdoor attacks in attack success rate and remains robust against multiple defense mechanisms across several FSSL scenarios.
HPE (Hallucinated Positive Entanglement)
Novel technique introduced
Federated self-supervised learning (FSSL) enables collaborative training of self-supervised representation models without sharing raw unlabeled data. While it serves as a crucial paradigm for privacy-preserving learning, it remains vulnerable to backdoor attacks, in which malicious clients manipulate local training to inject targeted backdoors. Existing FSSL attack methods, however, often suffer from low utilization of poisoned samples, limited transferability, and weak persistence. To address these limitations, we propose a new backdoor attack for FSSL, namely Hallucinated Positive Entanglement (HPE). HPE first employs hallucination-based augmentation, using synthetic positive samples to strengthen the encoder's embedding of backdoor features. It then introduces feature entanglement to enforce a tight binding between triggers and backdoor samples in the representation space. Finally, selective parameter poisoning and proximity-aware updates constrain the poisoned model to the vicinity of the global model, improving its stability and persistence. Experiments across several FSSL scenarios and datasets show that HPE substantially outperforms existing backdoor attack methods and remains robust under various defense mechanisms.
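The paper does not release code here, but the feature-entanglement idea can be pictured as a loss that pulls embeddings of triggered inputs toward an attacker-chosen target representation while keeping clean embeddings away from it. The sketch below is our own minimal illustration (function names, the cosine-similarity formulation, and the margin term are all our assumptions, not the authors' implementation):

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Normalize rows to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def entanglement_loss(trigger_emb, target_emb, clean_emb, margin=0.5):
    """Illustrative entanglement-style loss (assumed form, not the paper's):
    - pull: align embeddings of triggered samples with the target embedding;
    - push: penalize clean embeddings that get closer than `margin`
      (in cosine similarity) to that target, so normal behavior is preserved."""
    t = l2_normalize(trigger_emb)
    g = l2_normalize(target_emb)
    c = l2_normalize(clean_emb)
    pull = 1.0 - np.mean(np.sum(t * g, axis=-1))                       # 0 when perfectly aligned
    push = np.mean(np.maximum(0.0, np.sum(c * g, axis=-1) - margin))   # 0 when clean stays far
    return pull + push
```

In a real attack this term would be minimized jointly with the client's normal SSL objective; here it only conveys the "tight binding" intuition from the abstract.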
Key Contributions
- Hallucination-based augmentation using synthetic positive samples to enhance encoder embedding of backdoor trigger features in FSSL
- Feature entanglement technique that enforces tight binding between triggers and backdoor samples in the representation space
- Selective parameter poisoning with proximity-aware updates to keep poisoned models near the global model, improving backdoor persistence across aggregation rounds
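The last contribution, keeping the poisoned model near the global model, can be sketched as a simple projection of the malicious client's weights onto an L2 ball around the global weights. This is our own hedged approximation of a "proximity-aware update" (the radius constraint and projection are assumptions; the paper's actual mechanism may differ):

```python
import numpy as np

def proximity_project(poisoned_w, global_w, radius):
    """Our sketch of a proximity-aware update: clip the poisoned model's
    deviation from the global model to an L2 ball of the given radius,
    so the malicious update stays close to the global weights and is
    less likely to be flagged or averaged away during aggregation."""
    delta = poisoned_w - global_w
    norm = np.linalg.norm(delta)
    if norm > radius:
        delta = delta * (radius / norm)  # rescale onto the ball's surface
    return global_w + delta
```

A defender-side norm check (as in norm-clipping defenses) would then see an update indistinguishable in magnitude from benign ones, which is the stealth property the contribution targets.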
🛡️ Threat Analysis
HPE is explicitly a backdoor/trojan attack: it embeds hidden behavior into federated SSL encoders that activates only on trigger-bearing inputs, while the model behaves normally otherwise. The paper proposes novel techniques (hallucination-based augmentation, feature entanglement, proximity-aware updates) to improve the backdoor's persistence and stealth across FL aggregation rounds.
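To make the hallucination-based augmentation step concrete, the toy sketch below stamps a trigger patch onto an image and generates several noisy "synthetic positive" views of it; a contrastive SSL objective would then pull these views' embeddings together, entrenching the trigger feature. Everything here (patch placement, Gaussian jitter standing in for real augmentations, function names) is our illustrative assumption, not the paper's pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_trigger(img, patch, x=0, y=0):
    """Stamp a small trigger patch onto a copy of the image (illustrative only)."""
    out = img.copy()
    ph, pw = patch.shape[:2]
    out[y:y + ph, x:x + pw] = patch
    return out

def hallucinated_positives(img, patch, n_views=4, noise_std=0.05):
    """Sketch of hallucination-based augmentation: every synthetic view
    carries the trigger, so treating them as positives in a contrastive
    loss teaches the encoder to embed the trigger feature strongly."""
    base = apply_trigger(img, patch)
    return [np.clip(base + rng.normal(0.0, noise_std, base.shape), 0.0, 1.0)
            for _ in range(n_views)]
```

Real attacks would use the victim's own augmentation pipeline (crops, color jitter, etc.) instead of plain noise; the point is only that all positives share the trigger.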