Attack (2025)

"Energon": Unveiling Transformers from GPU Power and Thermal Side-Channels

Arunava Chaudhuri, Shubhi Shukla, Sarani Bhattacharya, Debdeep Mukhopadhyay



Published on arXiv (2508.01768)

Model Theft (OWASP ML Top 10 — ML05)

Input Manipulation Attack (OWASP ML Top 10 — ML01)

Key Finding

A GPU power/thermal side channel achieves over 89% model-family identification accuracy and 100% hyperparameter-classification accuracy, enabling downstream black-box adversarial attacks with over 93% success rate against deployed transformers.

Energon

Novel technique introduced


Transformers have become the backbone of many Machine Learning (ML) applications, including language translation, summarization, and computer vision. As these models are increasingly deployed in shared Graphics Processing Unit (GPU) environments via Machine Learning as a Service (MLaaS), concerns around their security grow. In particular, the risk of side-channel attacks that reveal architectural details without physical access remains underexplored, despite the high value of the proprietary models they target. This work is, to the best of our knowledge, the first to investigate GPU power and thermal fluctuations as side channels and to exploit them to extract information from pre-trained transformer models. The proposed analysis shows how these side channels can be exploited with only user-level privileges to reveal critical architectural details, such as encoder/decoder layer counts and attention head counts, for both language and vision transformers. We demonstrate the practical impact by evaluating multiple publicly available pre-trained language and vision transformers. Through extensive experimental evaluation, we show that the attack model achieves high accuracy: over 89% on average for model family identification and 100% for hyperparameter classification, in both single-process and noisy multi-process scenarios. Moreover, by leveraging the extracted architectural information, we demonstrate highly effective black-box transfer adversarial attacks with an average success rate exceeding 93%, underscoring the security risks posed by GPU side-channel leakage in deployed transformer models.
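The extraction idea in the abstract can be illustrated with a toy sketch: record a power trace during a victim's inference, summarize it with coarse features, and match it against profiled model families. Everything below is synthetic and hypothetical — the trace generator, the feature choice, and the family names stand in for the paper's real GPU sensor measurements and are not taken from the paper.

```python
import random
import statistics

def trace_features(trace):
    # Coarse features of a power trace (watts over time): length tracks
    # inference latency (more layers -> longer trace), and the sum
    # approximates total energy drawn.
    return (len(trace), sum(trace))

def synthetic_trace(n_layers, rng, samples_per_layer=20):
    # Hypothetical generator, NOT the paper's data: each transformer layer
    # contributes a high-power burst followed by a short inter-layer lull.
    trace = []
    for _ in range(n_layers):
        peak = rng.uniform(180, 220)
        idle = rng.uniform(60, 80)
        trace += [peak + rng.gauss(0, 5) for _ in range(samples_per_layer)]
        trace += [idle + rng.gauss(0, 5) for _ in range(samples_per_layer // 4)]
    return trace

def nearest_centroid(features, centroids):
    # Identify the family whose profiled feature centroid is closest.
    return min(
        centroids,
        key=lambda fam: sum((a - b) ** 2 for a, b in zip(features, centroids[fam])),
    )

rng = random.Random(0)
families = {"bert-base": 12, "bert-large": 24}  # illustrative layer counts

# Profiling phase: the attacker builds centroids from traces of known models.
centroids = {}
for fam, layers in families.items():
    feats = [trace_features(synthetic_trace(layers, rng)) for _ in range(10)]
    centroids[fam] = tuple(statistics.mean(f[i] for f in feats) for i in range(2))

# Attack phase: classify an unseen victim trace (here, a 24-layer model).
victim = trace_features(synthetic_trace(24, rng))
guess = nearest_centroid(victim, centroids)
```

The design point this sketch captures is that model-family identification can reduce to ordinary template matching once the side channel yields per-inference traces; the paper's actual classifier and features are more sophisticated than this nearest-centroid toy.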


Key Contributions

  • First work to use GPU power and thermal fluctuations as side channels to extract transformer architectural details (layer counts, attention heads, hyperparameters) with only user-level privileges in shared GPU/MLaaS environments
  • Achieves 89%+ model family identification accuracy and 100% hyperparameter classification in both single-process and noisy multi-process scenarios
  • Demonstrates practical downstream impact via black-box transfer adversarial attacks using extracted architecture, achieving 93%+ average success rate

🛡️ Threat Analysis

Input Manipulation Attack

The paper explicitly demonstrates that the extracted architectural information enables highly effective black-box transfer adversarial attacks (over 93% average success rate), showing the full attack chain from architecture extraction to input manipulation.
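A minimal, fully synthetic sketch of the transfer step: the attacker trains a surrogate whose architecture matches the extracted one, crafts an FGSM perturbation against the surrogate, and the perturbation carries over to the victim. The logistic "models" and every weight below are illustrative stand-ins, not the paper's transformers or its attack implementation.

```python
import math

# Illustrative stand-ins (NOT the paper's models): a victim and a surrogate
# the attacker trained to match the extracted architecture, so their
# decision boundaries are similar but not identical.
victim_w, victim_b = [2.0, -1.5], 0.3
surrogate_w, surrogate_b = [1.8, -1.4], 0.25

def predict(w, b, x):
    # Logistic model: P(class 1 | x).
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    # Fast Gradient Sign Method on the surrogate: for logistic loss,
    # d(loss)/dx_i = (p - y) * w_i, so step eps along the gradient's sign.
    p = predict(w, b, x)
    return [xi + eps * math.copysign(1.0, (p - y) * wi) for xi, wi in zip(x, w)]

x = [0.4, 0.1]                                         # victim says class 1 here
adv = fgsm(surrogate_w, surrogate_b, x, y=1, eps=0.6)  # crafted on the surrogate
transferred = predict(victim_w, victim_b, adv) < 0.5   # label flips on the victim
```

The point of the sketch is why architecture extraction matters downstream: the closer the surrogate's decision boundary is to the victim's, the more reliably gradients computed on the surrogate transfer, which is what the paper's 93%+ success rate quantifies for real transformers.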

Model Theft

The core contribution is extracting transformer architectural details (encoder/decoder layer counts, attention heads, hyperparameters) from proprietary models with only user-level privileges via GPU power/thermal side channels in MLaaS. This is a model-stealing attack that targets model IP without direct model access.


Details

Domains
nlp, vision
Model Types
transformer, llm, vlm
Threat Tags
black_box, inference_time
Datasets
BERT variants, Vision Transformers (ViT)
Applications
mlaas platforms, language translation, text summarization, image classification