Hide and Find: A Distributed Adversarial Attack on Federated Graph Learning
Jinshan Liu , Ken Li , Jiazhe Wei , Bin Shi , Bo Dong
Published on arXiv
2603.07743
Input Manipulation Attack
OWASP ML Top 10 — ML01
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
FedShift achieves the highest attack success rate among existing advanced methods on six large-scale datasets, while evading 3 robust FL defenses and cutting convergence time cost by over 90%.
FedShift
Novel technique introduced
Federated Graph Learning (FedGL) is vulnerable to malicious attacks, yet developing a truly effective and stealthy attack method remains a significant challenge. Existing attack methods suffer from low attack success rates and high computational costs, and are easily identified and smoothed out by defense algorithms. To address these challenges, we propose **FedShift**, a novel two-stage "Hide and Find" distributed adversarial attack. In the first stage, before FedGL training begins, we inject a learnable, hidden "shifter" into part of the training data; it subtly pushes poisoned graph representations toward a target class's decision boundary without crossing it, keeping the attack stealthy during training. In the second stage, after FedGL training completes, we leverage the global model and use the hidden shifter as an optimization starting point to efficiently find adversarial perturbations. At attack time, we aggregate these perturbations from multiple malicious clients into the final effective adversarial sample and trigger the attack. Extensive experiments on six large-scale datasets demonstrate that our method achieves the highest attack effectiveness among existing advanced attack methods. In particular, our attack effectively evades 3 mainstream robust federated learning defense algorithms and converges with a time-cost reduction of over 90%, highlighting its exceptional stealthiness, robustness, and efficiency.
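As a toy illustration of the Stage-1 "hide" objective (not the paper's actual loss: the linear readout, the gap band `[LO, HI]`, and the step size are all assumptions), the sketch below nudges a poisoned representation toward the target class's decision boundary while a hinge keeps the true class on top, so the sample is still classified correctly during training:

```python
import numpy as np

# Toy linear "readout" standing in for a trained graph model: z = W @ (x + s).
W = np.array([[ 1.0, 0.0],    # row 0: true class
              [ 0.0, 1.0],
              [-1.0, 0.5]])   # row 2: attacker's target class
x = np.array([2.0, 0.0])      # clean pooled graph representation
TRUE, TARGET = 0, 2
LO, HI = 0.1, 0.5             # keep the (true - target) logit gap in this band

def gap(s):
    z = W @ (x + s)
    return z[TRUE] - z[TARGET]  # > 0 means the sample is still correctly ranked

# For a linear model the gradient of the gap w.r.t. the shifter is constant.
w_gap = W[TRUE] - W[TARGET]
s = np.zeros(2)               # the learnable shifter, initialized to zero
for _ in range(500):
    g = gap(s)
    # Subgradient of max(g - HI, 0) + max(LO - g, 0): shrink the gap while it
    # is comfortably positive, grow it back if it ever approaches zero.
    s -= 0.01 * (float(g > HI) - float(g < LO)) * w_gap
```

After optimization the gap settles just inside the band: the representation sits near the target boundary ("hidden" close to it) but never crosses, which is what makes the poisoning hard for defenses to spot during training.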
Key Contributions
- FedShift: a two-stage "Hide and Find" distributed adversarial attack on Federated Graph Learning that injects a covert, learnable shifter into training data (Stage 1) to set up an efficient adversarial-perturbation search after training (Stage 2)
- Stealthy data poisoning via a shifter that nudges graph representations toward the target class's decision boundary without crossing it, evading 3 mainstream robust FL defense algorithms (e.g., Byzantine-robust aggregation schemes)
- Distributed adversarial perturbation aggregation across multiple malicious clients that achieves the highest attack success rate on six large-scale graph datasets with over 90% reduction in convergence time cost versus prior methods
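The third contribution, combining perturbations found by several malicious clients, can be sketched roughly as follows (the mean-then-clip rule and the `eps` budget are illustrative assumptions, not the paper's actual aggregation scheme):

```python
import numpy as np

def aggregate_perturbations(deltas, eps=0.3):
    """Average the perturbations contributed by each malicious client, then
    project the result onto an L-inf ball of radius eps so the final
    adversarial sample stays subtle (illustrative sketch)."""
    combined = np.mean(deltas, axis=0)
    return np.clip(combined, -eps, eps)

# Two malicious clients, each holding a locally optimized perturbation.
deltas = [np.array([0.5, -0.2]), np.array([0.1, -0.6])]
final = aggregate_perturbations(deltas)  # mean is [0.3, -0.4]; clipped to budget
```

Averaging before projection means no single client's perturbation dominates, while the clip keeps the aggregated sample within a fixed distortion budget.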
🛡️ Threat Analysis
Stage 2 of FedShift is an adversarial input manipulation attack: after FL training, malicious clients use the global model and the pre-injected shifter to find adversarial perturbations on graph inputs, then aggregate them across clients to cause misclassification at inference time — a core adversarial example/evasion attack.
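A minimal sketch of this "find" stage, assuming an FGSM-style sign-ascent loop and a linear surrogate gradient (the real attack optimizes against the trained global GNN; `grad_fn`, `lr`, and `eps` here are placeholders):

```python
import numpy as np

def find_adversarial(x, shifter, grad_fn, steps=50, lr=0.05, eps=0.5):
    """Warm-start the perturbation search from the Stage-1 shifter rather
    than from zero or random noise -- the claimed source of the large
    convergence-time savings -- then take sign-gradient ascent steps on the
    target-class score, projected onto an L-inf budget (illustrative)."""
    delta = shifter.copy()
    for _ in range(steps):
        delta += lr * np.sign(grad_fn(x + delta))  # ascend toward target class
        delta = np.clip(delta, -eps, eps)          # stay within the budget
    return delta

# Linear surrogate: the gradient of the target-class score is constant.
grad_fn = lambda v: np.array([1.0, -1.0])
delta = find_adversarial(np.zeros(2), shifter=np.zeros(2), grad_fn=grad_fn)
```

Starting from the shifter rather than a cold start means the optimizer begins already near the target boundary, so far fewer steps are needed to cross it.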
Stage 1 of FedShift poisons training data by injecting a learnable, hidden "shifter" into a subset of clients' training graphs, covertly shifting graph representations toward the target class's decision boundary to prepare the model for Stage 2 exploitation — the attack vector is the training data itself.