Forget and Explain: Transparent Verification of GNN Unlearning
Imran Ahsan, Hyunwook Yu, Jinsung Kim, Mucheol Kim
Published on arXiv
2512.07450
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
XAI-based attribution metrics reveal residual memorization in GNN unlearning methods (especially IDEA) that accuracy and membership inference tests both overlook.
Explainability-Driven GNN Unlearning Verifier
Novel technique introduced
Graph neural networks (GNNs) are increasingly used to model complex patterns in graph-structured data. However, enabling them to "forget" designated information remains challenging, especially under privacy regulations such as the GDPR. Existing unlearning methods largely optimize for efficiency and scalability, yet they offer little transparency, and the black-box nature of GNNs makes it difficult to verify whether forgetting has truly occurred. We propose an explainability-driven verifier for GNN unlearning that snapshots the model before and after deletion, using attribution shifts and localized structural changes (for example, graph edit distance) as transparent evidence. The verifier uses five explainability metrics: residual attribution, heatmap shift, explainability score deviation, graph edit distance, and a diagnostic graph rule shift. We evaluate two backbones (GCN, GAT) and four unlearning strategies (Retrain, GraphEditor, GNNDelete, IDEA) across five benchmarks (Cora, Citeseer, Pubmed, Coauthor-CS, Coauthor-Physics). Results show that Retrain and GNNDelete achieve near-complete forgetting, GraphEditor provides partial erasure, and IDEA leaves residual signals. These explanation deltas provide the primary, human-readable evidence of forgetting; we also report membership-inference ROC-AUC as a complementary, graph-wide privacy signal.
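The snapshot-and-compare idea behind two of the verifier's metrics can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names, the unit-mass normalization, and the use of a boolean mask over deleted elements are all assumptions.

```python
import numpy as np

def heatmap_shift(attr_before, attr_after):
    """L2 distance between attribution heatmaps taken before and after
    unlearning, each normalized to unit mass (illustrative definition)."""
    a = attr_before / (np.abs(attr_before).sum() + 1e-12)
    b = attr_after / (np.abs(attr_after).sum() + 1e-12)
    return float(np.linalg.norm(a - b))

def residual_attribution(attr_after, deleted_mask):
    """Fraction of post-unlearning attribution mass still assigned to
    deleted nodes/edges; values near zero suggest complete forgetting."""
    total = np.abs(attr_after).sum() + 1e-12
    return float(np.abs(attr_after[deleted_mask]).sum() / total)
```

Under this framing, a method like Retrain would be expected to drive `residual_attribution` toward zero, while a method leaving residual signals (as the paper reports for IDEA) would keep it visibly above zero even when accuracy looks unchanged.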
Key Contributions
- Five explainability-based metrics (Residual Attribution, Heatmap Shift, Explainability Score Deviation, Graph Edit Distance, Graph Rule Shift) for transparent GNN unlearning verification
- Unified verification pipeline combining GraphChef and ProxyGraph-inspired k-hop local proxies with attribution heatmaps to trace information flow before and after deletion
- Empirical evaluation showing that XAI metrics detect residual memorization in methods (e.g., IDEA) that appear to pass standard accuracy and MIA tests
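For the structural side of the verification, graph edit distance over a k-hop local proxy becomes cheap when node IDs are aligned between the pre- and post-deletion snapshots: it reduces to the symmetric difference of edge sets plus any node changes. The sketch below makes that simplifying assumption and is not the paper's exact GED computation.

```python
def local_edit_distance(edges_before, edges_after):
    """Edit distance between two snapshots of a k-hop local subgraph,
    assuming node IDs are aligned across snapshots. Undirected edges
    are stored as frozensets so (u, v) and (v, u) compare equal."""
    eb = {frozenset(e) for e in edges_before}
    ea = {frozenset(e) for e in edges_after}
    return len(eb ^ ea)  # edges present in exactly one snapshot
```

For example, deleting one edge from a triangle yields a distance of 1, a localized, human-readable trace of what the unlearning step actually changed.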
🛡️ Threat Analysis
Membership-inference ROC-AUC is explicitly used as a privacy evaluation signal for GNN unlearning. The paper further shows that the XAI metrics detect residual memorization (information still attributable to deleted samples) that MIA tests fail to surface, directly addressing whether an adversary could still determine membership after unlearning.
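The complementary MIA signal reduces to a ROC-AUC over attack scores for member versus non-member nodes. A minimal stdlib sketch, using the Mann-Whitney formulation of AUC (the scoring setup is hypothetical, not the paper's attack):

```python
def roc_auc(member_scores, nonmember_scores):
    """Probability that a randomly chosen member outscores a randomly
    chosen non-member, counting ties as half (Mann-Whitney U / AUC)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))
```

After ideal unlearning, member and non-member score distributions should be indistinguishable, pulling the AUC toward 0.5; the paper's point is that an AUC near 0.5 alone does not rule out the residual memorization the explanation deltas expose.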