
Private and interpretable clinical prediction with quantum-inspired tensor train models

José Ramón Pareja Monturiol 1,2, Juliette Sinnott 3, Roger G. Melko 3,4, Mohammad Kohandel 3

0 citations · 49 references · arXiv (Cornell University)


Published on arXiv · 2602.06110

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Tensorization reduces white-box membership inference attack success to random guessing and degrades black-box attack performance comparably to Differential Privacy, while maintaining predictive accuracy close to unprotected models.

Tensor Train Tensorization

Novel technique introduced


Machine learning in clinical settings must balance predictive accuracy, interpretability, and privacy. Models such as logistic regression (LR) offer transparency, while neural networks (NNs) provide greater predictive power; yet both remain vulnerable to privacy attacks. We empirically assess these risks by designing attacks that identify which public datasets were used to train a model under varying levels of adversarial access, applying them to LORIS, a publicly available LR model for immunotherapy response prediction, as well as to additional shallow NN models trained for the same task. Our results show that both models leak significant training-set information, with LRs proving particularly vulnerable in white-box scenarios. Moreover, we observe that common practices such as cross-validation in LRs exacerbate these risks. To mitigate these vulnerabilities, we propose a quantum-inspired defense based on tensorizing discretized models into tensor trains (TTs), which fully obfuscates parameters while preserving accuracy, reducing white-box attacks to random guessing and degrading black-box attacks comparably to Differential Privacy. TT models retain LR interpretability and extend it through efficient computation of marginal and conditional distributions, while also enabling this higher level of interpretability for NNs. Our results demonstrate that tensorization is widely applicable and establishes a practical foundation for private, interpretable, and effective clinical prediction.
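The defense tensorizes a discretized model into a tensor train (TT). As a generic illustration of that representation, the following sketch implements the standard TT-SVD algorithm (successive truncated SVDs of the unfolded tensor) with NumPy; it is our own minimal sketch of the TT format, not the paper's implementation, and all function names are ours.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a d-way tensor into TT cores G_k of shape
    (r_{k-1}, n_k, r_k) via successive truncated SVDs (TT-SVD)."""
    dims = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(dims[0], -1)          # unfold along the first mode
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))              # truncate to the TT rank budget
        cores.append(u[:, :r].reshape(rank, dims[k], r))
        mat = (s[:r, None] * vt[:r]).reshape(r * dims[k + 1], -1)
        rank = r
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(out.shape[1:-1])        # drop boundary ranks of size 1
```

With a large enough `max_rank` the decomposition is exact; truncating it is what redistributes information across cores, so no single core exposes the original parameters.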


Key Contributions

  • Shadow-model membership inference attack identifying which public datasets were in training sets for both LR (LORIS) and shallow NN clinical models, demonstrating that cross-validation in LRs severely amplifies leakage
  • Quantum-inspired tensor train (TT) tensorization defense that fully obfuscates model parameters, reducing white-box MIA to random guessing and matching Differential Privacy on black-box attacks while preserving accuracy
  • Demonstration that TT models preserve LR interpretability (monotonicity, marginals, conditionals) and extend it to NNs, offering a post-training privacy-interpretability solution for clinical settings
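The interpretability claim rests on the fact that a TT-represented joint distribution admits cheap marginalization: summing out a variable is one contraction of its core with the all-ones vector, so any single-variable marginal costs O(d) small matrix products instead of an exponential sum. The paper computes such marginals and conditionals; the generic sketch below, including all names, is ours.

```python
import numpy as np

def tt_marginal(cores, keep):
    """Marginal p(x_keep) of a TT-represented joint: sum out all other
    variables by contracting their cores with the all-ones vector."""
    left = np.ones((1, 1))
    for G in cores[:keep]:
        left = left @ G.sum(axis=1)            # marginalize variables before `keep`
    right = np.ones((1, 1))
    for G in reversed(cores[keep + 1:]):
        right = G.sum(axis=1) @ right          # marginalize variables after `keep`
    # p(x) = left · G_keep[:, x, :] · right for each value x
    return np.einsum('ab,bxc,cd->x', left, cores[keep], right)
```

Conditionals follow the same pattern: fix a variable by slicing its core at the observed value instead of summing, then renormalize.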

🛡️ Threat Analysis

Membership Inference Attack

The paper's primary contributions are a membership inference attack (identifying which public datasets were used to train LORIS and the NN models under white-box and black-box access) and a tensor train defense that reduces white-box MIA to random guessing, directly targeting the membership inference threat model.
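To make the threat model concrete, here is a toy shadow-model membership inference attack in the spirit of Shokri et al.: shadow models trained on known splits supply labeled (confidence, member/non-member) pairs, and an attack classifier learns to separate them. The paper's actual attack (dataset-level inference against LORIS) differs in detail; all data, helper names, and parameters below are our own synthetic stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_task(n, d=20):
    """Toy tabular task standing in for clinical data."""
    X = rng.normal(size=(n, d))
    y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

def confidence(model, X):
    # Attack feature: confidence in the predicted class; overfit models
    # are typically more confident on training members than on outsiders.
    return model.predict_proba(X).max(axis=1, keepdims=True)

# 1. Train shadow models on known splits to generate labeled attack data.
feats, member = [], []
for _ in range(8):
    X_in, y_in = sample_task(40)
    X_out, _ = sample_task(40)
    shadow = LogisticRegression(max_iter=1000).fit(X_in, y_in)
    feats += [confidence(shadow, X_in), confidence(shadow, X_out)]
    member += [np.ones(40), np.zeros(40)]

# 2. Train the attack classifier on (confidence, membership) pairs.
attack = LogisticRegression().fit(np.vstack(feats), np.concatenate(member))

# 3. Query the target: if it leaks, the average membership score on its
#    training set exceeds the score on unseen data.
X_tr, y_tr = sample_task(40)
target = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
in_score = attack.predict_proba(confidence(target, X_tr))[:, 1].mean()
out_score = attack.predict_proba(confidence(target, sample_task(40)[0]))[:, 1].mean()
```

A defense like TT tensorization aims to push `in_score` and `out_score` together, i.e. drive the attack toward random guessing.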


Details

Domains
tabular
Model Types
traditional_ml
Threat Tags
white_box, black_box, training_time
Datasets
LORIS
Applications
clinical prediction, immunotherapy response prediction