Stealing AI Model Weights Through Covert Communication Channels

AI models are often regarded as valuable intellectual property due to the high cost of their development, the competitive advantage they provide, and the proprietary techniques involved in their creation. As a result, AI model stealing attacks pose a serious concern for AI model providers. In this work, we present a novel attack targeting wireless devices equipped with AI hardware accelerators. The attack unfolds in two phases. In the first phase, the victim's device is compromised with a hardware Trojan (HT) designed to covertly leak model weights through a hidden communication channel, without the victim realizing it. In the second phase, the adversary uses a nearby wireless device to intercept the victim's transmission frames during normal operation and incrementally reconstruct the complete weight matrix. The proposed attack is agnostic to both the AI model architecture and the hardware accelerator used. We validate our approach through a hardware-based demonstration involving four diverse AI models of varying types and sizes. We detail the design of the HT and the covert channel, highlighting their stealthy nature. Additionally, we analyze the impact of bit error rates on the reception and propose an error mitigation technique. The effectiveness of the attack is evaluated based on the accuracy of the reconstructed models with stolen weights and the time required to extract them. Finally, we explore potential defense mechanisms.

Key Contributions

Two-phase attack combining hardware Trojan implantation with wireless frame interception to incrementally reconstruct complete AI model weight matrices
Architecture-agnostic covert channel design that embeds model weights into normal wireless transmission frames without alerting the victim
Bit error rate analysis and error mitigation technique for reliable weight reconstruction over a wireless covert channel
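
This summary does not say which error mitigation technique the paper uses. As a rough illustration only, the sketch below assumes a simple repetition scheme: the same bit stream is received several times over a noisy channel and combined by bitwise majority vote. The channel model (independent bit flips at a fixed bit error rate), the helper names `flip_bits` and `majority_vote`, and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def flip_bits(bits, ber, rng):
    """Assumed channel model: flip each bit independently with probability `ber`."""
    noise = (rng.random(bits.shape) < ber).astype(np.uint8)
    return bits ^ noise

def majority_vote(receptions):
    """Bitwise majority across an odd number of noisy copies of the same bit stream."""
    stacked = np.stack(receptions)
    return (stacked.sum(axis=0) * 2 > len(receptions)).astype(np.uint8)

rng = np.random.default_rng(0)

# Hypothetical float32 weight vector, serialized to a flat array of bits.
weights = rng.standard_normal(1000).astype(np.float32)
bits = np.unpackbits(weights.view(np.uint8))

ber = 0.01  # illustrative bit error rate, not a figure from the paper
single = flip_bits(bits, ber, rng)
voted = majority_vote([flip_bits(bits, ber, rng) for _ in range(5)])

single_errors = int((single != bits).sum())
voted_errors = int((voted != bits).sum())

# Rebuild float32 values from the voted bits to inspect the damage to the weights.
recovered = np.packbits(voted).view(np.float32)
print(f"bit errors: single reception={single_errors}, 5-copy vote={voted_errors}")
```

A single flipped bit in the exponent field of a float32 can change a weight by many orders of magnitude, which is why even a low bit error rate can matter for the accuracy of a reconstructed model. Repetition trades reception time for reliability; the paper's own technique may work differently.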

🛡️ Threat Analysis

Model Theft

The primary contribution is stealing AI model weights (intellectual property) using a hardware Trojan as a covert exfiltration mechanism. This is model theft via a hardware-based side/covert channel, matching the 'side-channel attacks to extract model parameters' criterion under ML05.

Details

Model Types

cnn, transformer

Threat Tags

inference_time, targeted, physical

Applications

2026 0 cit.

Similar Papers

Double Strike: Breaking Approximation-Based Side-Channel Countermeasures for DNNs

Is the Hard-Label Cryptanalytic Model Extraction Really Polynomial?

Delving into Cryptanalytic Extraction of PReLU Neural Networks

Kraken: Higher-order EM Side-Channel Attacks on DNNs in Near and Far Field

A PUF-Based Approach for Copy Protection of Intellectual Property in Neural Network Models

StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data

Data Augmentation Techniques to Reverse-Engineer Neural Network Weights from Input-Output Queries

SPOILER: TEE-Shielded DNN Partitioning of On-Device Secure Inference with Poison Learning