Amulet: Fast TEE-Shielded Inference for On-Device Model Protection
Zikai Mao, Lingchen Zhao, Lei Xu, Wentao Dong, Shenyi Zhang, Cong Wang, Qian Wang
Published on arXiv: 2512.07495
Model Theft
OWASP ML Top 10 — ML05
Key Finding
Achieves an 8-9x speedup over full-TEE baselines and roughly 2.2x over the state-of-the-art obfuscation-based method, at only 2.8-4.8x the latency of unprotected inference and with negligible accuracy loss
Novel Technique Introduced
Amulet
On-device machine learning (ML) raises new concerns about model privacy: storing valuable trained models on user devices exposes them to extraction by adversaries. The current mainstream defense is to store the weights and run inference inside Trusted Execution Environments (TEEs). However, because trusted memory is too small to hold an entire model, most existing approaches partition the model into slices that are loaded into the TEE sequentially, and this frequent crossing between the untrusted and trusted worlds inflates inference latency, sometimes by orders of magnitude. In this paper, we propose Amulet, a fast TEE-shielded on-device inference framework for ML model protection. Amulet incorporates a suite of obfuscation methods designed for common neural network architectures. Once the TEE has obfuscated the model, the entire transformed model can be stored safely in untrusted memory, so inference executes there directly with GPU acceleration. Each inference request needs only two rounds of minimal-overhead interaction between the untrusted and trusted worlds: one to process the input sample and one to produce the result. We also give an information-theoretic proof that the obfuscated model leaks no information about the original weights. We evaluate Amulet comprehensively on diverse architectures ranging from ResNet-18 to GPT-2. Our approach incurs inference latency of only 2.8-4.8x that of unprotected models with negligible accuracy loss, achieves an 8-9x speedup over baselines that execute inference entirely within TEEs, and runs approximately 2.2x faster than the state-of-the-art obfuscation-based method.
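To make the two-round interaction concrete, here is a minimal, runnable Python sketch of the inference flow the abstract describes. All names (`ToyTEE`, `encode_input`, `decode_output`, `obfuscate`) are hypothetical, not Amulet's API, and a plain permutation is used only to keep the arithmetic transparent: a permutation merely shuffles weight values rather than hiding them, so it is far weaker than Amulet's actual per-layer transforms.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyTEE:
    """Stands in for the trusted world. The secret transforms never
    leave this object; only the obfuscated weights do."""

    def __init__(self, n_in, n_out):
        self.p_in = rng.permutation(n_in)    # secret input permutation
        self.p_out = rng.permutation(n_out)  # secret output permutation

    def obfuscate(self, W):
        # One-time setup: permute the rows and columns of W. The result
        # is stored in untrusted memory and can run on the GPU.
        return W[self.p_out][:, self.p_in]

    def encode_input(self, x):
        # Round 1: reorder the input to match the obfuscated weights.
        return x[self.p_in]

    def decode_output(self, y_enc):
        # Round 2: invert the output permutation to recover the result.
        y = np.empty_like(y_enc)
        y[self.p_out] = y_enc
        return y

# Untrusted world: observes only W_obf, x_enc, and y_enc.
n_in, n_out = 8, 4
W = rng.normal(size=(n_out, n_in))       # the valuable plaintext weights
tee = ToyTEE(n_in, n_out)
W_obf = tee.obfuscate(W)

x = rng.normal(size=n_in)
y_enc = W_obf @ tee.encode_input(x)      # heavy compute, GPU-friendly
assert np.allclose(tee.decode_output(y_enc), W @ x)
```

The point of the sketch is the shape of the protocol: one cheap TEE call before and one after the bulk matrix computation, with everything in between running in untrusted memory.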
Key Contributions
- TEE-based obfuscation suite for convolutional layers, attention blocks, and non-linear activations, enabling the full transformed model to reside safely in untrusted memory
- Architecture requiring only two TEE interaction rounds per inference request, enabling GPU acceleration in untrusted memory
- Information-theoretic proof that the obfuscated model reveals zero information about the original weights (the standard masking argument is sketched after this list)
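Zero-leakage claims of this kind are usually established with the classical one-time-pad argument. A minimal sketch of that argument, assuming weights are fixed-point encoded into $\mathbb{Z}_q$ and each mask is fresh, uniform, and kept inside the TEE; the paper's actual construction and proof may differ:

```latex
% Assumed additive-masking form; Amulet's actual transform may differ.
% W: encoded weights in Z_q^n;  R: uniform mask drawn inside the TEE.
\[
  W' \equiv W + R \pmod{q}, \qquad
  R \sim \mathrm{Unif}\!\left(\mathbb{Z}_q^n\right), \quad
  R \text{ independent of } W .
\]
% For any fixed value of W, W' is uniform on Z_q^n, so
\[
  I(W; W') \;=\; H(W') - H(W' \mid W) \;=\; n \log q - n \log q \;=\; 0,
\]
% i.e., observing the obfuscated weights gives zero information about W.
```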
🛡️ Threat Analysis
The explicit threat model is an adversary who extracts valuable trained weights from untrusted device memory. Amulet defends against this by having the TEE obfuscate the weights before they are stored in untrusted memory, so that what an attacker can dump reveals nothing about the originals: a direct defense against model IP theft.
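To make that "reveals nothing" property tangible, here is a toy demonstration of the adversary's view under the same additive-masking assumption sketched above. The encoding width `q` and the masking scheme are illustrative choices, not parameters from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical adversary's view under additive masking (one assumed
# instantiation, not Amulet's exact transform). Weights are fixed-point
# encoded into Z_q, then masked inside the TEE.
q = 2**16
w = (rng.normal(size=100_000) * 256).astype(np.int64) % q  # encoded weights
r = rng.integers(0, q, size=w.shape)   # fresh uniform mask (TEE secret)
w_obf = (w + r) % q                    # what untrusted memory actually stores

# The dump is statistically uniform regardless of w: its histogram is
# flat, so it carries no information about the plaintext weights.
counts, _ = np.histogram(w_obf, bins=16, range=(0, q))
print(counts / counts.sum())           # ~[0.0625] * 16 for any choice of w
```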