GraphToxin: Reconstructing Full Unlearned Graphs from Graph Unlearning
Published on arXiv: 2511.10936
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
GraphToxin successfully reconstructs full unlearned graphs, including a deleted node's personal links and sensitive content from its connections, under both white-box and black-box settings; existing defense mechanisms are largely ineffective and in some cases even amplify attack performance.
GraphToxin
Novel technique introduced
Graph unlearning has emerged as a promising solution for complying with "the right to be forgotten" regulations by enabling the removal of sensitive information upon request. However, this solution is not foolproof. The involvement of multiple parties creates new attack surfaces, and residual traces of deleted data can remain in the unlearned graph neural networks (GNNs). Attackers can exploit these vulnerabilities to recover the supposedly erased samples, undermining the intended functionality of graph unlearning. In this work, we propose GraphToxin, the first full graph reconstruction attack against graph unlearning. Specifically, we introduce a novel curvature matching module that provides fine-grained guidance for unlearned graph recovery. We demonstrate that GraphToxin can successfully subvert the regulatory guarantees expected from graph unlearning: it can recover not only a deleted individual's information and personal links but also sensitive content from their connections, posing a substantially more detrimental threat. Furthermore, we extend GraphToxin to multiple-node removal under both white-box and black-box settings, showcasing its practical feasibility and potential to cause considerable harm. We highlight the necessity of worst-case analysis and propose a systematic evaluation framework to assess attack performance under both random and worst-case node removal scenarios. Our extensive experiments demonstrate the effectiveness and flexibility of GraphToxin. Notably, existing defense mechanisms are largely ineffective against this attack and, in some cases, even amplify its performance. Given the severe privacy risks posed by GraphToxin, our work underscores the urgent need for more effective and robust defenses.
Key Contributions
- First full graph reconstruction attack against graph unlearning, recovering deleted nodes, personal links, and sensitive data from connections
- Novel curvature matching module providing fine-grained structural guidance for unlearned graph recovery
- Systematic worst-case evaluation framework assessing attack performance under both random and adversarial node removal scenarios; demonstrates existing defenses are ineffective or counterproductive
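The paper does not publish the internals of its curvature matching module, but the idea of using edge curvature as fine-grained structural guidance can be sketched with Forman-Ricci curvature, which for an unweighted edge (u, v) is simply 4 − deg(u) − deg(v). The following toy sketch (all names and target values are illustrative assumptions, not the paper's implementation) scores a candidate adjacency matrix against reference curvatures an attacker might estimate from the unlearned model:

```python
# Hedged sketch of a curvature-matching objective. Forman-Ricci curvature
# is used as a simple, computable stand-in; the paper's exact module and
# guidance signal may differ.

def forman_curvature(adj, u, v):
    """Forman-Ricci curvature of edge (u, v) in an unweighted graph:
    F(u, v) = 4 - deg(u) - deg(v)."""
    return 4 - sum(adj[u]) - sum(adj[v])

def curvature_matching_loss(adj, target_curvatures):
    """Squared-error mismatch between the candidate graph's edge
    curvatures and hypothetical reference curvatures."""
    return sum(
        (forman_curvature(adj, u, v) - target) ** 2
        for (u, v), target in target_curvatures.items()
    )

# Toy 4-node candidate graph: a path 0-1-2-3 as an adjacency matrix.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Assumed reference curvatures the attacker tries to match.
targets = {(0, 1): 1, (1, 2): 0, (2, 3): 1}
print(curvature_matching_loss(adj, targets))  # → 0 (perfect match)
```

In an actual attack, this loss would be one term in an optimization loop that edits the candidate graph until its structure is consistent with the reference statistics.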
🛡️ Threat Analysis
GraphToxin is a training data reconstruction attack: an adversary exploits residual traces in unlearned GNNs to recover supposedly deleted nodes, personal links, and sensitive graph data. This satisfies the ML03 adversary test — an attacker is actively reconstructing private training data from a model. The paper explicitly attacks graph unlearning methods by showing that erased data can be recovered, which per the machine unlearning decision tree maps directly to ML03.
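The residual-trace signal underlying such attacks can be illustrated with a minimal sketch: if an "unlearned" model is still more confident on a deleted node than a model retrained from scratch without that node, memorization has leaked through. The probabilities below are assumed toy values, and real attacks would query GNN posteriors rather than scalar stand-ins:

```python
import math

# Hedged sketch of probing for residual traces of a deleted node:
# compare the unlearned model's loss on that node against a
# retrain-from-scratch baseline.

def nll(prob_correct):
    """Negative log-likelihood the model assigns to the true label."""
    return -math.log(prob_correct)

def residual_trace_score(p_unlearned, p_retrained):
    """Positive score => the unlearned model is still more confident on
    the deleted node than a clean retrain, i.e. residual memorization."""
    return nll(p_retrained) - nll(p_unlearned)

# Assumed numbers: the unlearned model still assigns 0.9 to the deleted
# node's true class, while a clean retrain would assign only 0.6.
score = residual_trace_score(p_unlearned=0.9, p_retrained=0.6)
print(score > 0)  # → True: residual signal an attacker can exploit
```

A full reconstruction attack goes far beyond this membership-style signal, but the same gap between "unlearned" and "never trained on" is what makes recovery of the deleted data possible.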