A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives
Kaixiang Zhao¹, Lincan Li², Kaize Ding³, Neil Zhenqiang Gong⁴, Yue Zhao⁵, Yushun Dong²
Published on arXiv: 2508.15031
Model Theft
OWASP ML Top 10 — ML05
Key Finding
Identifies the utility-security trade-off as the central unsolved challenge in defending against model extraction, and proposes a novel taxonomy spanning attack mechanisms, defenses, and computing paradigms.
Machine learning (ML) models have grown significantly in complexity and utility, driving advances across multiple domains. However, the need for substantial computational resources and specialized expertise has historically restricted their wide adoption. Machine-Learning-as-a-Service (MLaaS) platforms have addressed these barriers by providing scalable, convenient, and affordable access to sophisticated ML models through user-friendly APIs. While this accessibility promotes widespread use of advanced ML capabilities, it also introduces vulnerabilities exploited through Model Extraction Attacks (MEAs). Recent studies have demonstrated that adversaries can systematically replicate a target model's functionality by interacting with publicly exposed interfaces, posing threats to intellectual property, privacy, and system security. In this paper, we offer a comprehensive survey of MEAs and corresponding defense strategies. We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments. Our analysis covers various attack techniques, evaluates their effectiveness, and highlights challenges faced by existing defenses, particularly the critical trade-off between preserving model utility and ensuring security. We further assess MEAs within different computing paradigms and discuss their technical, ethical, legal, and societal implications, along with promising directions for future research. This systematic survey aims to serve as a valuable reference for researchers, practitioners, and policymakers engaged in AI security and privacy. Additionally, we maintain an online repository continuously updated with related literature at https://github.com/kzhao5/ModelExtractionPapers.
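The core extraction loop the abstract describes (query an exposed prediction API, then train a surrogate on the returned labels) can be illustrated with a minimal sketch. Everything here is hypothetical for illustration: `victim_api` stands in for an MLaaS endpoint, the victim is a hidden linear classifier, and the surrogate is a plain perceptron; real attacks surveyed in the paper use far richer query strategies and model families.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "victim" model behind an API: a secret linear classifier.
W_SECRET = np.array([2.0, -1.0])

def victim_api(x):
    """Black-box prediction endpoint: returns only the label, as MLaaS APIs do."""
    return int(x @ W_SECRET > 0)

# Attack step 1: query the API on attacker-chosen inputs.
queries = rng.normal(size=(500, 2))
labels = np.array([victim_api(x) for x in queries])

# Attack step 2: fit a surrogate on the (input, label) pairs.
# Here: a perceptron trained on the stolen labels.
w = np.zeros(2)
for _ in range(20):
    for x, y in zip(queries, labels):
        pred = int(x @ w > 0)
        w += (y - pred) * x  # perceptron update

# Fidelity: how often the surrogate agrees with the victim on fresh inputs.
test_points = rng.normal(size=(1000, 2))
agreement = np.mean([int(x @ w > 0) == victim_api(x) for x in test_points])
print(f"surrogate-victim agreement: {agreement:.2f}")
```

The point of the sketch is that the attacker never sees `W_SECRET`, yet the surrogate reaches high agreement with the victim purely from query-label pairs — the functionality-theft scenario the survey's taxonomy organizes.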
Key Contributions
- Comprehensive taxonomy classifying MEAs by attack mechanisms, defense approaches, and computing environments (including MLaaS, federated, and edge paradigms)
- Systematic analysis of attack-defense trade-offs, particularly the tension between model utility preservation and security
- Discussion of technical, ethical, legal, and societal implications of MEAs, plus a continuously updated online literature repository
🛡️ Threat Analysis
The paper's entire focus is on Model Extraction Attacks — adversaries querying ML APIs to clone model functionality — which is the canonical model theft threat. Defenses surveyed include watermarking for ownership verification and anti-extraction techniques, all squarely within ML05.
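One of the surveyed defense families, trigger-set watermarking for ownership verification, can be sketched in a few lines. This is a simplified illustration, not a method from the paper: `verify_ownership`, the trigger set, and both toy models are hypothetical, and real watermark verification must also account for false-claim rates and watermark removal attacks.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trigger set: inputs paired with owner-chosen labels,
# embedded during training so that (ideally) only the owner's model
# and models extracted from it memorize them.
trigger_inputs = rng.normal(size=(20, 4))
trigger_labels = rng.integers(0, 2, size=20)

def verify_ownership(model_fn, threshold=0.9):
    """Claim ownership if a suspect model reproduces the secret trigger labels."""
    preds = np.array([model_fn(x) for x in trigger_inputs])
    return bool(np.mean(preds == trigger_labels) >= threshold)

# A model that memorized the trigger set (stand-in for the watermarked original).
memorized = {tuple(x): int(y) for x, y in zip(trigger_inputs, trigger_labels)}
watermarked = lambda x: memorized[tuple(x)]

# An independent model with no knowledge of the triggers.
independent = lambda x: int(x.sum() > 0)

print(verify_ownership(watermarked))   # True: trigger labels reproduced
print(verify_ownership(independent))
```

An unrelated model matches the 20 arbitrary trigger labels by chance only with vanishing probability, which is what makes the trigger set usable as ownership evidence — while also illustrating the utility-security trade-off the paper highlights, since embedding triggers can degrade the model's accuracy on normal inputs.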