Tighter Truncated Rectangular Prism Approximation for RNN Robustness Verification
Xingqi Lin 1, Liangyu Chen 1, Min Wu 1, Min Zhang 1, Zhenbing Zeng 2
Published on arXiv
arXiv:2511.11699
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
DeepPrism achieves higher robustness verification accuracy than prior state-of-the-art methods across four datasets and three task types by tightening the linear relaxation of LSTM nonlinearities.
DeepPrism
Novel technique introduced
Robustness verification is a promising technique for rigorously proving the robustness of Recurrent Neural Networks (RNNs). A key challenge is to over-approximate the nonlinear activation functions with linear constraints, which transforms the verification problem into an efficiently solvable linear programming problem. Existing methods over-approximate each nonlinear part individually with linear bounding planes, which may cause significant over-estimation and lower verification accuracy. In this paper, to tightly enclose the three-dimensional nonlinear surface generated by the Hadamard product, we propose a novel truncated rectangular prism formed by two linear relaxation planes, together with a refinement-driven method that minimizes both its volume and surface area for a tighter over-approximation. Based on this approximation, we implement a prototype, DeepPrism, for RNN robustness verification. Experimental results demonstrate that DeepPrism significantly improves on state-of-the-art approaches across image classification, speech recognition, and sentiment analysis tasks.
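To make the prism idea concrete, the sketch below builds two parallel planes that enclose the LSTM gate surface z = sigmoid(x) * tanh(y) over an input box. This is an illustration only, not the paper's optimization: the planes are obtained by taking the tangent plane at the box center and shifting it down/up by the maximum sampled deviation, which yields a sound enclosure on the sampled grid.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def prism_planes(xl, xu, yl, yu, n=50):
    """Illustrative (not the paper's method): two parallel planes bounding
    z = sigmoid(x) * tanh(y) over the box [xl, xu] x [yl, yu]."""
    xc, yc = (xl + xu) / 2, (yl + yu) / 2
    s, t = sigmoid(xc), np.tanh(yc)
    # Gradient of sigmoid(x) * tanh(y) at the box center.
    a = s * (1 - s) * t        # partial derivative in x
    b = s * (1 - t * t)        # partial derivative in y
    plane = lambda x, y: a * (x - xc) + b * (y - yc) + s * t
    # Sample the surface and measure its deviation from the tangent plane.
    xs, ys = np.meshgrid(np.linspace(xl, xu, n), np.linspace(yl, yu, n))
    diff = sigmoid(xs) * np.tanh(ys) - plane(xs, ys)
    lower = lambda x, y: plane(x, y) + diff.min()  # lower face of the prism
    upper = lambda x, y: plane(x, y) + diff.max()  # upper face of the prism
    return lower, upper

lower, upper = prism_planes(-1.0, 1.0, -1.0, 1.0)
```

The gap between the two faces is what the paper's refinement step seeks to shrink: a smaller prism means a tighter linear relaxation fed to the LP solver.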
Key Contributions
- Novel truncated rectangular prism abstraction formed by two linear relaxation planes to tightly over-approximate the 3D nonlinear surface of Hadamard-product gated functions in LSTMs
- Refinement-driven optimization using a hybrid objective combining volume and surface area of the prism to minimize over-approximation error
- DeepPrism verifier outperforming SOTA RNN robustness verification baselines across image classification, speech recognition, and sentiment analysis tasks
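The second contribution's hybrid objective can be sketched geometrically. The toy function below (an assumption for illustration, not the paper's exact formula) scores a truncated rectangular prism whose top and bottom faces are parallel planes with gradient (a, b) separated by a constant vertical gap over a 2D box, combining its volume and surface area with a weight w.

```python
import math

def prism_objective(a, b, gap, xl, xu, yl, yu, w=0.5):
    """Illustrative hybrid objective: weighted sum of volume and surface
    area of a truncated rectangular prism over [xl, xu] x [yl, yu], whose
    parallel top/bottom faces have slope (a, b) and vertical gap `gap`."""
    dx, dy = xu - xl, yu - yl
    base_area = dx * dy
    volume = gap * base_area  # parallel faces => constant vertical height
    # Each slanted face is the base area scaled by the plane's tilt factor.
    face_area = base_area * math.sqrt(1 + a * a + b * b)
    # Four vertical side faces: each is a parallelogram of height `gap`.
    lateral = 2 * (dx + dy) * gap
    surface = 2 * face_area + lateral
    return w * volume + (1 - w) * surface
```

Minimizing such an objective drives the two relaxation planes together, shrinking the over-approximation region that the verifier must reason about.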
🛡️ Threat Analysis
Proposes a certified defense (formal robustness verification) against adversarial input perturbations for RNNs — provides provable guarantees that model outputs are invariant within an ε-perturbation ball, directly addressing the inference-time adversarial example threat.
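The property being certified is local robustness: the predicted class must not change for any input in the L-infinity ε-ball. A minimal sketch of that property as an empirical check (an assumption for illustration; sampling can only falsify robustness, whereas DeepPrism proves it):

```python
import numpy as np

def is_empirically_robust(model, x, eps, n=1000, seed=0):
    """Sampling check of the property a verifier certifies formally:
    the argmax prediction stays fixed across the L-infinity eps-ball.
    Returning True is NOT a proof; returning False is a counterexample."""
    rng = np.random.default_rng(seed)
    base = np.argmax(model(x))
    for _ in range(n):
        delta = rng.uniform(-eps, eps, size=x.shape)
        if np.argmax(model(x + delta)) != base:
            return False  # found a perturbation that flips the prediction
    return True  # no counterexample found among the samples

# Hypothetical toy linear "model" for demonstration only.
model = lambda v: np.array([v.sum(), -v.sum()])
```

Formal verification replaces the sampling loop with a proof over the entire ball, which is exactly where tight linear relaxations of the LSTM nonlinearities matter.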