Forget and Explain: Transparent Verification of GNN Unlearning
Imran Ahsan, Hyunwook Yu, Jinsung Kim, Mucheol Kim
Published on arXiv
2512.07450
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
XAI-based attribution metrics reveal residual memorization in GNN unlearning methods (especially IDEA) that accuracy and membership inference tests both overlook.
Explainability-Driven GNN Unlearning Verifier
Novel technique introduced
Graph neural networks (GNNs) are increasingly used to model complex patterns in graph-structured data. However, enabling them to "forget" designated information remains challenging, especially under privacy regulations such as the GDPR. Existing unlearning methods largely optimize for efficiency and scalability, yet they offer little transparency, and the black-box nature of GNNs makes it difficult to verify whether forgetting has truly occurred. We propose an explainability-driven verifier for GNN unlearning that snapshots the model before and after deletion, using attribution shifts and localized structural changes (for example, graph edit distance) as transparent evidence. The verifier uses five explainability metrics: residual attribution, heatmap shift, explainability score deviation, graph edit distance, and a diagnostic graph rule shift. We evaluate two backbones (GCN, GAT) and four unlearning strategies (Retrain, GraphEditor, GNNDelete, IDEA) across five benchmarks (Cora, Citeseer, Pubmed, Coauthor-CS, Coauthor-Physics). Results show that Retrain and GNNDelete achieve near-complete forgetting, GraphEditor provides partial erasure, and IDEA leaves residual signals. These explanation deltas provide the primary, human-readable evidence of forgetting; we also report membership-inference ROC-AUC as a complementary, graph-wide privacy signal.
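The snapshot-and-compare idea behind two of the verifier's metrics can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names, the unit-mass normalization, and the use of a boolean mask over deleted elements are all assumptions.

```python
import numpy as np

def heatmap_shift(attr_before, attr_after):
    """L2 distance between attribution heatmaps taken before and after
    unlearning, each normalized to unit mass (illustrative definition)."""
    a = attr_before / (np.abs(attr_before).sum() + 1e-12)
    b = attr_after / (np.abs(attr_after).sum() + 1e-12)
    return float(np.linalg.norm(a - b))

def residual_attribution(attr_after, deleted_mask):
    """Fraction of post-unlearning attribution mass still assigned to
    deleted nodes/edges; values near zero suggest complete forgetting."""
    total = np.abs(attr_after).sum() + 1e-12
    return float(np.abs(attr_after[deleted_mask]).sum() / total)
```

Under this framing, a method like Retrain would be expected to drive `residual_attribution` toward zero, while a method leaving residual signals (as the paper reports for IDEA) would keep it visibly above zero even when accuracy looks unchanged.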
Key Contributions
- Five explainability-based metrics (Residual Attribution, Heatmap Shift, Explainability Score Deviation, Graph Edit Distance, Graph Rule Shift) for transparent GNN unlearning verification
- Unified verification pipeline combining GraphChef and ProxyGraph-inspired k-hop local proxies with attribution heatmaps to trace information flow before and after deletion
- Empirical evaluation showing that XAI metrics detect residual memorization in methods (e.g., IDEA) that appear to pass standard accuracy and MIA tests
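For the structural side of the verification, graph edit distance over a k-hop local proxy becomes cheap when node IDs are aligned between the pre- and post-deletion snapshots: it reduces to the symmetric difference of edge sets plus any node changes. The sketch below makes that simplifying assumption and is not the paper's exact GED computation.

```python
def local_edit_distance(edges_before, edges_after):
    """Edit distance between two snapshots of a k-hop local subgraph,
    assuming node IDs are aligned across snapshots. Undirected edges
    are stored as frozensets so (u, v) and (v, u) compare equal."""
    eb = {frozenset(e) for e in edges_before}
    ea = {frozenset(e) for e in edges_after}
    return len(eb ^ ea)  # edges present in exactly one snapshot
```

For example, deleting one edge from a triangle yields a distance of 1, a localized, human-readable trace of what the unlearning step actually changed.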
🛡️ Threat Analysis
Membership-inference ROC-AUC is explicitly used as a privacy evaluation signal for GNN unlearning. The paper further shows that the XAI metrics detect residual memorization (information still attributable to deleted samples) that MIA tests fail to surface, directly addressing whether an adversary could still determine membership after unlearning.
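The complementary MIA signal reduces to a ROC-AUC over attack scores for member versus non-member nodes. A minimal stdlib sketch, using the Mann-Whitney formulation of AUC (the scoring setup is hypothetical, not the paper's attack):

```python
def roc_auc(member_scores, nonmember_scores):
    """Probability that a randomly chosen member outscores a randomly
    chosen non-member, counting ties as half (Mann-Whitney U / AUC)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))
```

After ideal unlearning, member and non-member score distributions should be indistinguishable, pulling the AUC toward 0.5; the paper's point is that an AUC near 0.5 alone does not rule out the residual memorization the explanation deltas expose.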