Privacy-Preserving Explainable AIoT Application via SHAP Entropy Regularization
Dilli Prasad Sharma, Xiaowei Sun, Liang Xue, Xiaodong Lin, Pulei Xiong
Published on arXiv: 2511.09775
Membership Inference Attack
OWASP ML Top 10 — ML04
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
SHAP entropy regularization substantially reduces privacy leakage from explanation outputs compared to both standard LSTM and DP-LSTM baselines while maintaining high predictive accuracy and explanation fidelity.
SHAP Entropy Regularization
Novel technique introduced
The widespread integration of Artificial Intelligence of Things (AIoT) in smart home environments has amplified the demand for transparent and interpretable machine learning models. To foster user trust and comply with emerging regulatory frameworks, Explainable AI (XAI) methods, particularly post-hoc techniques such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), are widely employed to elucidate model behavior. However, recent studies have shown that these explanation methods can inadvertently expose sensitive user attributes and behavioral patterns, thereby introducing new privacy risks. To address these concerns, we propose a novel privacy-preserving approach based on SHAP entropy regularization to mitigate privacy leakage in explainable AIoT applications. Our method incorporates an entropy-based regularization objective that penalizes low-entropy SHAP attribution distributions during training, promoting a more uniform spread of feature contributions. To evaluate the effectiveness of our approach, we developed a suite of SHAP-based privacy attacks that strategically leverage model explanation outputs to infer sensitive information. We validate our method through comparative evaluations using these attacks alongside utility metrics on benchmark smart home energy consumption datasets. Experimental results demonstrate that SHAP entropy regularization substantially reduces privacy leakage compared to baseline models, while maintaining high predictive accuracy and explanation fidelity. This work contributes to the development of privacy-preserving explainable AI techniques for secure and trustworthy AIoT applications.
Key Contributions
- SHAP entropy regularization: a training-time objective that penalizes low-entropy (concentrated) SHAP attribution distributions to reduce privacy leakage in model explanations
- A suite of five SHAP-based privacy attacks (entropy attack, membership similarity attack, divergence attack, rank correlation attack, rank consistency attack) for evaluating explanation privacy
- SHAP entropy-regularized LSTM for smart home energy forecasting, empirically outperforming baseline LSTM and DP-LSTM in privacy protection while maintaining predictive accuracy
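The core idea of the regularizer is to penalize explanation distributions that concentrate attribution mass on a few features. The paper's exact loss formulation is not reproduced here; the following is a minimal sketch, assuming per-sample attribution vectors (e.g. SHAP values) are normalized into a probability distribution whose Shannon entropy is compared against the uniform-distribution maximum. The function name `shap_entropy_penalty` and the use of NumPy rather than the paper's training framework are illustrative assumptions.

```python
import numpy as np

def shap_entropy_penalty(attributions, eps=1e-12):
    """Entropy-based penalty on a batch of attribution vectors.

    attributions: (batch, n_features) array of per-feature attribution
    scores (e.g. SHAP values). Concentrated (low-entropy) attribution
    distributions yield a larger penalty; a perfectly uniform spread
    of feature contributions yields a penalty of zero.

    NOTE: illustrative sketch, not the paper's implementation.
    """
    p = np.abs(attributions) + eps                # magnitudes, numerically safe
    p = p / p.sum(axis=1, keepdims=True)          # normalize to a distribution
    entropy = -(p * np.log(p)).sum(axis=1)        # Shannon entropy per sample
    max_entropy = np.log(attributions.shape[1])   # entropy of the uniform case
    return float((max_entropy - entropy).mean())  # gap to uniform, averaged

# Concentrated attributions incur a higher penalty than uniform ones.
concentrated = np.array([[10.0, 0.01, 0.01, 0.01]])
uniform = np.ones((1, 4))
assert shap_entropy_penalty(concentrated) > shap_entropy_penalty(uniform)
```

In training, a term like `loss = task_loss + lam * shap_entropy_penalty(attrs)` (with `lam` a tuning weight, an assumed hyperparameter name) would push the model toward the more uniform attribution spread the abstract describes.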
🛡️ Threat Analysis
The suite of SHAP-based privacy attacks (divergence attack, rank correlation attack, entropy attack) targets inference of sensitive behavioral attributes (occupancy patterns, routines, appliance usage) from model explanation outputs, a form of model inversion that exploits XAI interfaces rather than raw model outputs.
The paper also develops a membership inference attack (the membership similarity attack) that exploits SHAP explanation outputs to determine whether data points were in the training set; the defense (SHAP entropy regularization) is evaluated against these attacks.
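To make the entropy-attack threat concrete, the sketch below thresholds the entropy of a sample's attribution vector: the working assumption (not a claim from the paper) is that a model producing concentrated, "confident" explanations on a point is more likely to have seen it during training. The function names and the threshold value are hypothetical; the paper's actual attack formulations are not reproduced here.

```python
import numpy as np

def attribution_entropy(attributions, eps=1e-12):
    """Shannon entropy of each row of a (batch, n_features) attribution
    array, after normalizing absolute scores into a distribution."""
    p = np.abs(attributions) + eps
    p = p / p.sum(axis=1, keepdims=True)
    return -(p * np.log(p)).sum(axis=1)

def entropy_membership_attack(attributions, threshold):
    """Flag samples whose explanation entropy falls below `threshold`
    as likely training-set members. Illustrative sketch only; the
    threshold would be calibrated on shadow or held-out data."""
    return attribution_entropy(attributions) < threshold

# Toy example: a concentrated attribution vector is flagged as a member,
# a near-uniform one is not (4 features, max entropy ln 4 ~= 1.386).
scores = np.array([[5.0, 0.1, 0.1, 0.1],   # concentrated  -> low entropy
                   [1.0, 1.1, 0.9, 1.0]])  # near-uniform  -> high entropy
guess = entropy_membership_attack(scores, threshold=1.0)
```

This also shows why the entropy regularizer defends against the attack: pushing all attribution distributions toward uniform collapses the entropy gap the attacker thresholds on.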