Robust GNN Watermarking via Implicit Perception of Topological Invariants
Jipeng Li 1, Yanning Shen 2
Published on arXiv: 2510.25934
Model Theft
OWASP ML Top 10 — ML05
Key Finding
InvGNN-WM matches clean task accuracy while achieving higher watermark accuracy than trigger-based and compression-based baselines. The watermark survives unstructured pruning, fine-tuning, and post-training quantization, and exact removal is proven NP-complete.
InvGNN-WM
Novel technique introduced
Graph Neural Networks (GNNs) are valuable intellectual property, yet many watermarks rely on backdoor triggers that break under common model edits and create ownership ambiguity. We present InvGNN-WM, which ties ownership to a model's implicit perception of a graph invariant, enabling trigger-free, black-box verification with negligible task impact. A lightweight head predicts normalized algebraic connectivity on an owner-private carrier set; a sign-sensitive decoder outputs bits, and a calibrated threshold controls the false-positive rate. Across diverse node and graph classification datasets and backbones, InvGNN-WM matches clean accuracy while yielding higher watermark accuracy than trigger- and compression-based baselines. It remains strong under unstructured pruning, fine-tuning, and post-training quantization; plain knowledge distillation (KD) weakens the mark, while KD with a watermark loss (KD+WM) restores it. We provide guarantees for imperceptibility and robustness, and we prove that exact removal is NP-complete.
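To make the target quantity concrete, the following is a minimal sketch of how the invariant could be computed for one carrier graph. The abstract does not spell out the normalization, so this sketch assumes that "normalized algebraic connectivity" means the second-smallest eigenvalue of the symmetric normalized Laplacian; the function name `normalized_algebraic_connectivity` is hypothetical and not taken from the paper.

```python
import numpy as np

def normalized_algebraic_connectivity(adj):
    """Second-smallest eigenvalue of the symmetric normalized Laplacian.

    Assumption (not from the paper): the invariant is lambda_2 of
    L = I - D^{-1/2} A D^{-1/2} for an undirected graph with a
    symmetric adjacency matrix `adj`.
    """
    adj = np.asarray(adj, dtype=float)
    if adj.ndim != 2 or adj.shape[0] != adj.shape[1] or adj.shape[0] < 2:
        raise ValueError("adj must be a square matrix with at least 2 nodes")

    deg = adj.sum(axis=1)
    has_edges = deg > 0

    # D^{-1/2}, with isolated nodes mapped to 0 to avoid division by zero.
    inv_sqrt_deg = np.zeros_like(deg)
    inv_sqrt_deg[has_edges] = 1.0 / np.sqrt(deg[has_edges])

    # Isolated nodes get a zero diagonal entry, so each one contributes
    # an eigenvalue of 0, like any other connected component.
    identity = np.diag(has_edges.astype(float))
    laplacian = identity - inv_sqrt_deg[:, None] * adj * inv_sqrt_deg[None, :]

    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(laplacian)
    return float(eigenvalues[1])
```

Under this assumption the value lies in [0, 2]. It is 0 for any disconnected graph, 1 for the 3-node path graph, and 1.5 for the triangle (up to floating-point error), which gives a bounded regression target for a small prediction head.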
Key Contributions
- InvGNN-WM: trigger-free, black-box GNN watermarking scheme that ties ownership to the model's implicit perception of a topological invariant (normalized algebraic connectivity) on an owner-private carrier set
- Sign-sensitive decoder with calibrated threshold enabling controllable false-positive rates and bit-level ownership verification
- Theoretical guarantees of imperceptibility and robustness, plus NP-completeness proof for exact watermark removal
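
The decoding and thresholding ideas listed above can be illustrated with a short sketch. It is not the authors' procedure: it assumes the decoder maps each real-valued head output on a carrier graph to one bit by its sign, and that the threshold is calibrated against a null model in which an unrelated network matches each key bit independently with probability 1/2. The names `decode_bits`, `match_threshold`, and `verify` are hypothetical.

```python
from math import comb

def decode_bits(scores):
    """Map each real-valued score to a bit by its sign (assumed decoder)."""
    return [1 if s > 0 else 0 for s in scores]

def match_threshold(n_bits, alpha):
    """Smallest k with P[Binomial(n_bits, 1/2) >= k] <= alpha.

    Assumed null model: an unrelated model matches each key bit
    independently with probability 1/2. A return value of n_bits + 1
    means no match count can meet the requested false-positive rate.
    """
    total = 2 ** n_bits
    tail = 0  # number of bit strings with at least k matches
    for k in range(n_bits, -1, -1):
        tail += comb(n_bits, k)
        if tail > alpha * total:
            return k + 1
    return 0

def verify(scores, key_bits, alpha=1e-3):
    """Claim ownership if enough decoded bits agree with the owner's key."""
    if len(scores) != len(key_bits):
        raise ValueError("scores and key_bits must have the same length")
    bits = decode_bits(scores)
    matches = sum(b == k for b, k in zip(bits, key_bits))
    return matches >= match_threshold(len(key_bits), alpha)
```

With a 10-bit key and a target false-positive rate of 0.01, this calibration requires all 10 bits to match, since 9 or more matches occur by chance with probability 11/1024 ≈ 0.0107. Longer keys tolerate some bit errors at the same rate, which is one reason a bit-level scheme can remain verifiable after model edits flip a few bits.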
🛡️ Threat Analysis
InvGNN-WM embeds a watermark in the GNN model's behavior, via a lightweight head that predicts normalized algebraic connectivity on an owner-private carrier set, so that ownership can be verified if the model is stolen or redistributed. This is classic model IP protection via model watermarking, not content provenance.