Intellectual Property in Graph-Based Machine Learning as a Service: Attacks and Defenses
Lincan Li, Bolin Shen, Chenxi Zhao, Yuxiang Sun, Kaixiang Zhao, Shirui Pan, Yushun Dong
Published on arXiv: 2508.19641
Model Theft
OWASP ML Top 10 — ML05
Model Inversion Attack
OWASP ML Top 10 — ML03
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
Presents the first taxonomy and open-source evaluation library unifying model extraction, membership inference, and training data reconstruction threats and defenses for graph ML-as-a-service
PyGIP
Novel technique introduced
Graph-structured data, which captures non-Euclidean relationships and interactions between entities, is growing in scale and complexity. As a result, training state-of-the-art graph machine learning (GML) models has become increasingly resource-intensive, turning these models and data into invaluable Intellectual Property (IP). To address the resource-intensive nature of model training, graph-based Machine-Learning-as-a-Service (GMLaaS) has emerged as an efficient solution by leveraging third-party cloud services for model development and management. However, deploying such models in GMLaaS also exposes them to potential threats from attackers. Specifically, while the APIs within a GMLaaS system provide interfaces for users to query the model and receive outputs, they also allow attackers to exploit and steal model functionalities or sensitive training data, posing severe threats to the safety of these GML models and the underlying graph data. To address these challenges, this survey systematically introduces the first taxonomy of threats and defenses at the level of both GML models and graph-structured data. Such a tailored taxonomy facilitates an in-depth understanding of GML IP protection. Furthermore, we present a systematic evaluation framework to assess the effectiveness of IP protection methods, introduce a curated set of benchmark datasets across various domains, and discuss their application scopes and future challenges. Finally, we establish an open-source, versatile library named PyGIP, which evaluates various attack and defense techniques in GMLaaS scenarios and facilitates the implementation of existing benchmark methods. The library resource can be accessed at: https://labrai.github.io/PyGIP. We believe this survey will play a fundamental role in intellectual property protection for GML and provide practical recipes for the GML community.
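The extraction threat the abstract describes can be illustrated with a minimal black-box attack loop: the attacker only sees the API's label outputs, yet can harvest enough (query, label) pairs to train a surrogate that mimics the victim. This is a toy sketch, not PyGIP's API; the linear victim stands in for a deployed GNN, and all names here (`query_api`, `W_victim`, `W_sur`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "victim" model hidden behind a GMLaaS-style API:
# the attacker only observes the predicted class for each query.
W_victim = rng.normal(size=(4, 2))  # secret parameters

def query_api(x):
    """Black-box prediction API: returns only the predicted label."""
    return int(np.argmax(x @ W_victim))

# Step 1: harvest (query, label) pairs through the public API.
queries = rng.normal(size=(500, 4))
labels = np.array([query_api(x) for x in queries])

# Step 2: train a surrogate on the stolen labels (perceptron-style
# updates stand in for fitting a full surrogate GNN).
W_sur = np.zeros((4, 2))
for _ in range(20):
    for x, y in zip(queries, labels):
        pred = int(np.argmax(x @ W_sur))
        if pred != y:
            W_sur[:, y] += 0.1 * x
            W_sur[:, pred] -= 0.1 * x

# Step 3: measure fidelity -- how often the surrogate agrees with
# the victim on fresh queries it has never labeled.
test_q = rng.normal(size=(200, 4))
fidelity = np.mean([query_api(x) == int(np.argmax(x @ W_sur)) for x in test_q])
print(f"surrogate fidelity: {fidelity:.2f}")
```

High fidelity here shows why query access alone is a realistic theft channel, and why the defenses the survey covers (watermarking, ownership verification, query auditing) operate at the API boundary.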
Key Contributions
- First comprehensive taxonomy of IP protection threats and defenses for graph ML at both model-level (extraction, watermarking) and data-level (membership inference, reconstruction) in GMLaaS
- Systematic evaluation framework and curated benchmark datasets for assessing GML IP protection methods across application domains
- Open-source PyGIP library implementing a wide range of attack and defense techniques for GMLaaS scenarios
🛡️ Threat Analysis
Explicitly covers attacks that recover sensitive information from training graph data via model outputs, mapping to data reconstruction and model inversion threats for graph-structured training data.
Survey covers membership inference attacks on graph data — determining whether specific nodes, edges, or subgraphs were present in the training graph, a well-established threat in GML privacy literature.
Primary focus is protecting GML model IP from extraction and stealing attacks via API queries in GMLaaS — directly covers model theft attacks, model watermarking defenses, and ownership verification for graph neural networks.
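The membership inference threat above can likewise be sketched in a few lines: a model that overfits its training nodes is systematically more confident on them than on unseen nodes, and an attacker can threshold that confidence gap. Everything below is synthetic and illustrative (random labels force memorization; a logistic model stands in for a GNN); no names are taken from PyGIP or any real GMLaaS system.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic setup: 20 "member" nodes with random labels, so the only
# way the model fits them is by memorization -- the leakage source
# that membership inference exploits.
d = 30
members = rng.normal(size=(20, d))
member_y = rng.integers(0, 2, size=20)
non_members = rng.normal(size=(200, d))

# Overfit a tiny logistic model on the member set only
# (gradient ascent on the log-likelihood).
w = np.zeros(d)
for _ in range(500):
    p = 1 / (1 + np.exp(-(members @ w)))
    w += 0.5 * members.T @ (member_y - p) / len(members)

def confidence(x):
    """Confidence in the predicted class (no true label needed)."""
    p = 1 / (1 + np.exp(-(x @ w)))
    return np.maximum(p, 1 - p)

# Members tend to receive higher confidence than unseen nodes;
# that gap is the attacker's membership signal.
m_conf, n_conf = confidence(members), confidence(non_members)
print(f"mean confidence  members: {m_conf.mean():.3f}  "
      f"non-members: {n_conf.mean():.3f}")

# AUC-style attack score: probability a random member outscores a
# random non-member (0.5 would mean no leakage at all).
auc = np.mean(m_conf[:, None] > n_conf[None, :])
print(f"membership attack score: {auc:.2f}")
```

An attack score above 0.5 indicates the model leaks membership of individual training nodes, which is why the survey treats data-level defenses (e.g., output perturbation) alongside model-level ones.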