DRGW: Learning Disentangled Representations for Robust Graph Watermarking

Graph-structured data is foundational to numerous web applications, and watermarking is crucial for protecting its intellectual property and ensuring data provenance. Existing watermarking methods primarily operate on graph structures or entangled graph representations, which compromises the transparency and robustness of watermarks because of information coupling in the graph representation and uncontrollable discretization when continuous numerical representations are transformed back into graph structures. This motivates us to propose DRGW, the first graph watermarking framework that addresses these issues through disentangled representation learning. Specifically, we design an adversarially trained encoder that learns a structural representation invariant to diverse perturbations and derives a statistically independent watermark carrier, ensuring both robustness and transparency of watermarks. Meanwhile, we devise a graph-aware invertible neural network to provide a lossless channel for watermark embedding and extraction, guaranteeing high detectability and transparency of watermarks. Additionally, we develop a structure-aware editor that resolves latent modifications into discrete graph edits, ensuring robustness against structural perturbations. Experiments on diverse benchmark datasets demonstrate the superior effectiveness of DRGW.
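
The "lossless channel" property of an invertible network can be illustrated with a generic additive coupling layer. The sketch below is not the graph-aware network from the paper; the names `shift`, `embed`, and `extract`, the fixed random weights `W`, and the tensor shapes are all assumptions made for illustration. It only shows why such a layer can be inverted exactly even when the function inside it is not invertible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed weights for the shift function; a real model would learn them.
W = rng.normal(size=(4, 4))

def shift(x1):
    """Arbitrary, non-invertible function of the half that passes through unchanged."""
    return np.tanh(x1 @ W)

def embed(carrier, watermark):
    """Forward pass: the carrier is left as-is, the watermark half is shifted by t(carrier)."""
    return carrier, watermark + shift(carrier)

def extract(carrier, mixed):
    """Inverse pass: subtract the same shift to recover the watermark half."""
    return mixed - shift(carrier)

carrier = rng.normal(size=(3, 4))                           # stand-in for a carrier latent
watermark = rng.integers(0, 2, size=(3, 4)).astype(float)   # stand-in watermark bits

c_out, mixed = embed(carrier, watermark)
recovered = extract(c_out, mixed)
print(np.allclose(recovered, watermark))
```

Because the inverse reuses the same `shift` on an input that the forward pass left untouched, recovery holds up to floating-point rounding for any choice of `shift`.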

Key Contributions

First graph watermarking framework using disentangled representation learning to decouple structural information from the watermark carrier, ensuring both transparency and robustness
Adversarially trained encoder that learns a perturbation-invariant structural representation with a statistically independent watermark carrier, paired with a graph-aware invertible neural network for lossless embedding and extraction
Structure-aware editor that translates continuous latent modifications into discrete graph edits, resolving the discretization degradation problem that undermines prior latent-space methods
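
The discretization problem named in the last contribution can be made concrete with a toy comparison. The summary does not specify how the structure-aware editor works, so the code below is only a rough sketch under stated assumptions: `threshold_edit` and `budgeted_edit` are hypothetical helpers, the graph is undirected and unweighted, and `scores` stands in for continuous edge scores decoded after a latent modification. Plain thresholding changes however many edges the noise happens to push across the cutoff, while a budgeted edit flips a fixed number of node pairs.

```python
import numpy as np

def threshold_edit(scores, tau=0.5):
    """Naive discretization: every entry above tau becomes an edge."""
    return (scores > tau).astype(int)

def budgeted_edit(adj, scores, k):
    """Flip only the k node pairs whose score disagrees most with the current graph."""
    n = adj.shape[0]
    iu = np.triu_indices(n, k=1)                  # undirected graph: upper triangle only
    disagreement = np.abs(scores[iu] - adj[iu])
    top = np.argsort(disagreement)[::-1][:k]      # positions of the k largest disagreements
    rows, cols = iu[0][top], iu[1][top]
    edited = adj.copy()
    edited[rows, cols] = 1 - edited[rows, cols]   # flip the selected pairs
    edited[cols, rows] = edited[rows, cols]       # mirror to keep the matrix symmetric
    return edited

rng = np.random.default_rng(0)
n = 6
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
adj = upper + upper.T                             # random symmetric 0/1 adjacency matrix
noise = np.triu(rng.normal(scale=0.3, size=(n, n)), k=1)
scores = adj + noise + noise.T                    # stand-in for decoded continuous scores

naive = threshold_edit(scores)
budgeted = budgeted_edit(adj, scores, k=2)
print("pairs changed by thresholding:", int(np.sum(naive != adj)) // 2)
print("pairs changed by budgeted edit:", int(np.sum(budgeted != adj)) // 2)
```

The budgeted variant changes exactly `k` node pairs by construction, whereas the count for thresholding depends on the noise in `scores`.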

🛡️ Threat Analysis

Output Integrity Attack

DRGW embeds watermarks in the graph data content itself to protect intellectual property and enable provenance tracking. This is content/data watermarking, analogous to text or image watermarking, not model-weight watermarking. The framework defends against watermark removal via structural perturbations, directly addressing output/content integrity.
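
A structural perturbation of the kind this threat model refers to can be sketched as random edge flipping. This is a generic illustration, not the attack suite used in the paper's experiments: the function `random_edge_flips`, its `ratio` and `seed` parameters, and the ring-graph example are all assumptions.

```python
import numpy as np

def random_edge_flips(adj, ratio, seed=0):
    """Flip a fraction `ratio` of node pairs in an undirected, unweighted graph."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    iu = np.triu_indices(n, k=1)                  # each unordered node pair once
    num_pairs = iu[0].size
    num_flips = int(round(ratio * num_pairs))
    chosen = rng.choice(num_pairs, size=num_flips, replace=False)
    rows, cols = iu[0][chosen], iu[1][chosen]
    attacked = adj.copy()
    attacked[rows, cols] = 1 - attacked[rows, cols]   # add missing edges, delete existing ones
    attacked[cols, rows] = attacked[rows, cols]       # mirror to keep the matrix symmetric
    return attacked

# Example: an 8-node ring graph with 10% of its 28 node pairs perturbed.
n = 8
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    j = (i + 1) % n
    adj[i, j] = adj[j, i] = 1

attacked = random_edge_flips(adj, ratio=0.1)
print("node pairs changed:", int(np.sum(attacked != adj)) // 2)
```

A robustness check would then run the watermark extractor on `attacked` and compare the recovered watermark with the embedded one.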