Efficient and Encrypted Inference using Binarized Neural Networks within In-Memory Computing Architectures
Gokulnath Rajendran 1, Suman Deb 2, Anupam Chattopadhyay 1
Published on arXiv
arXiv:2510.23034
Model Theft
OWASP ML Top 10 — ML05
Key Finding
BNN inference performed without the PUF-derived secret key yields accuracy below 15%, confirming effective protection of model parameters from hardware-level extraction attacks
PUF-Protected Encrypted BNN Inference
Novel technique introduced
Binarized Neural Networks (BNNs) are a class of deep neural networks designed to utilize minimal computational resources, which drives their popularity across various applications. Recent studies highlight the potential of mapping BNN model parameters onto emerging non-volatile memory technologies, specifically using crossbar architectures, resulting in improved inference performance compared to traditional CMOS implementations. However, the common practice of protecting model parameters from theft attacks by storing them in an encrypted format and decrypting them at runtime introduces significant computational overhead, thus undermining the core principles of in-memory computing, which aim to integrate computation and storage. This paper presents a robust strategy for protecting BNN model parameters, particularly within in-memory computing frameworks. Our method utilizes a secret key derived from a physical unclonable function to transform model parameters prior to storage in the crossbar. Subsequently, the inference operations are performed on the encrypted weights, achieving a very special case of Fully Homomorphic Encryption (FHE) with minimal runtime overhead. Our analysis reveals that inference conducted without the secret key results in drastically diminished performance, with accuracy falling below 15%. These results validate the effectiveness of our protection strategy in securing BNNs within in-memory computing architectures while preserving computational efficiency.
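The homomorphic property the abstract describes can be sketched in a few lines. The snippet below is illustrative only, not the authors' implementation: it models the PUF-derived secret as a ±1 sign mask, "encrypts" a BNN weight vector by elementwise sign inversion, and shows that the XNOR-popcount dot product computed on the encrypted weights (with the key applied to the activations) equals the plaintext result, so no runtime decryption of stored weights is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

def xnor_popcount(weights, activations):
    """Binary dot product as in BNN inference: for +/-1 vectors,
    XNOR-then-popcount reduces to an elementwise product and sum."""
    return int(np.sum(weights * activations))

# Plaintext BNN layer: +/-1 weights and +/-1 input activations.
w = rng.choice([-1, 1], size=64)
x = rng.choice([-1, 1], size=64)

# PUF-derived secret modeled as a +/-1 mask (+1 = keep, -1 = invert).
key = rng.choice([-1, 1], size=64)

# "Encrypt" the weights by sign inversion before storing them in the crossbar.
w_enc = w * key

# Legitimate inference applies the key to the activations instead of
# decrypting the weights; since key * key = 1 elementwise, the crossbar
# output on encrypted weights matches the plaintext dot product exactly.
assert xnor_popcount(w_enc, x * key) == xnor_popcount(w, x)
```

Because the key cancels algebraically inside the dot product, the crossbar only ever holds and computes on encrypted weights, which is the "very special case of FHE" the paper claims.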
Key Contributions
- Formalization of a PUF-based secret key scheme to encrypt BNN weight parameters before storage in RRAM crossbars, enabling inference on encrypted weights without runtime decryption overhead
- Three novel weight transformation techniques (building on prior inversion, swapping, and dummy weight methods) that degrade accuracy to below 15% for adversaries lacking the secret key
- Demonstration of a special-case FHE scheme for BNNs on in-memory computing hardware that preserves computational efficiency while protecting model IP
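The three transformation families the contributions build on (inversion, swapping, and dummy weights) can each be illustrated on a toy weight vector. This is a hedged sketch under assumed key encodings, not the paper's concrete scheme: the key is modeled respectively as a flip mask, a secret permutation, and a mask marking which crossbar cells hold real weights.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.choice([-1, 1], size=8)          # original +/-1 BNN weights

# 1) Inversion: key bits select which stored weights are sign-flipped.
inv_key = rng.choice([True, False], size=8)
w_inv = np.where(inv_key, -w, w)
assert np.array_equal(np.where(inv_key, -w_inv, w_inv), w)  # key holder recovers w

# 2) Swapping: the key defines a secret permutation of weight positions.
perm = rng.permutation(8)
w_swap = w[perm]
assert np.array_equal(w_swap[np.argsort(perm)], w)          # inverse permutation

# 3) Dummy weights: the key marks which crossbar cells hold real weights;
#    the remaining cells store decoys that corrupt keyless inference.
real_mask = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1], dtype=bool)
stored = np.empty(real_mask.size, dtype=int)
stored[real_mask] = w
stored[~real_mask] = rng.choice([-1, 1], size=int((~real_mask).sum()))
assert np.array_equal(stored[real_mask], w)                 # key selects real cells
```

An adversary reading the raw array sees sign-flipped, reordered, or decoy-padded values; only the key holder can undo each transformation.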
🛡️ Threat Analysis
The paper explicitly targets model parameter theft, where an adversary reverse-engineers RRAM hardware to extract BNN weights. The defense encrypts model parameters with PUF-derived keys before storage, so inference without the correct key collapses accuracy to below 15%; this amounts to model IP protection against hardware-level extraction attacks.
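A toy simulation makes the accuracy collapse plausible. This is not the paper's experiment (which measures end-to-end accuracy on real BNN models); it only shows that when an adversary runs inference directly on sign-inverted weights, each neuron's pre-activation sign agrees with the true one at roughly chance level.

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 255, 1000                    # odd length avoids zero pre-activations
key = rng.choice([-1, 1], size=n)        # stand-in for a PUF-derived sign mask

agree = 0
for _ in range(trials):
    w = rng.choice([-1, 1], size=n)      # true +/-1 weights
    x = rng.choice([-1, 1], size=n)      # +/-1 input activations
    legit = np.sign(np.dot(w, x))        # neuron sign with true weights
    stolen = np.sign(np.dot(w * key, x)) # adversary uses encrypted weights as-is
    agree += int(legit == stolen)

rate = agree / trials
# With a random key inverting about half the weights, the stolen sign
# agrees only near chance, so stacked layers of such neurons would drive
# classification accuracy toward the near-random levels the paper reports.
```

The per-neuron agreement hovers near 50% because roughly half the product terms change sign under the key, decorrelating the stolen output from the true one.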