defense 2026

FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning

Qian Feng 1, JiaHang Tu 1, Mintong Kang 2, Hanbin Zhao 1, Chao Zhang 1, Hui Qian 1

3 citations · 1 influential · 85 references · arXiv


Published on arXiv: 2601.13578

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

FG-OrIU achieves 'deep forgetting' where DIP-based reconstruction from forgotten class features produces only noise, whereas existing unlearning methods leave residual recoverable information yielding blurred but recognizable images.

FG-OrIU

Novel technique introduced


Incremental unlearning (IU) is critical for pre-trained models that must comply with sequential data-deletion requests, yet existing methods primarily suppress parameters or confuse knowledge without explicit constraints at both the feature and gradient levels, resulting in *superficial forgetting*, where residual information remains recoverable. This incomplete forgetting risks security breaches and disrupts the balance between removal and retention, especially in IU scenarios. We propose FG-OrIU (**F**eature-**G**radient **Or**thogonality for **I**ncremental **U**nlearning), the first framework to unify orthogonal constraints at both the feature and gradient levels to achieve deep forgetting, where the forgetting effect is irreversible. FG-OrIU decomposes the feature space via Singular Value Decomposition (SVD), separating forgetting-class and remaining-class features into distinct subspaces. It then enforces dual constraints: feature orthogonal projection on both forgetting and remaining classes, and gradient orthogonal projection that prevents both the reintroduction of forgotten knowledge and disruption to the remaining classes during updates. Additionally, dynamic subspace adaptation merges newly forgetting subspaces and contracts remaining subspaces, ensuring a stable balance between removal and retention across sequential unlearning tasks. Extensive experiments demonstrate the effectiveness of the method.
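The SVD-based subspace separation described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the feature matrix, the subspace rank `k`, and the variable names are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature matrix for the forgetting class: n samples x d dims.
F_forget = rng.standard_normal((32, 16))

# SVD of the forgetting-class features; the top-k right singular vectors
# span the (assumed) forgetting subspace.
_, _, Vt = np.linalg.svd(F_forget, full_matrices=False)
k = 4                      # assumed subspace rank (a hyperparameter)
U_f = Vt[:k].T             # d x k orthonormal basis of the forgetting subspace

# Projector onto the orthogonal complement of the forgetting subspace.
P_perp = np.eye(16) - U_f @ U_f.T

# A feature passed through P_perp retains no component in the
# forgetting subspace, so nothing there is left to reconstruct.
x = rng.standard_normal(16)
x_clean = P_perp @ x
print(np.allclose(U_f.T @ x_clean, 0))  # → True: residual is zero
```

The same projector idea underlies the feature orthogonal projection constraint: once features are confined to the complement of the forgetting subspace, the forgotten directions carry no signal.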


Key Contributions

  • FG-OrIU framework enforcing dual orthogonal constraints at both feature and gradient levels to achieve irreversible 'deep forgetting' rather than superficial feature degradation
  • SVD-based feature space decomposition that separates forgetting and remaining class subspaces with orthogonal projection constraints preventing knowledge re-entanglement
  • Dynamic subspace adaptation mechanism that merges newly forgotten subspaces and contracts remaining subspaces across sequential unlearning tasks
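The gradient-level constraint from the contributions above can also be sketched: gradients are projected so that updates neither re-enter the forgetting subspace nor disturb the remaining-class subspace. The bases and the helper `project_grad` below are illustrative assumptions; in the paper the bases would come from the SVD of class features.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8

# Hypothetical orthonormal bases (columns) for the forgetting and
# remaining subspaces, built here from a QR factorization so the two
# blocks are mutually orthogonal.
Q, _ = np.linalg.qr(rng.standard_normal((d, 4)))
U_forget, U_remain = Q[:, :2], Q[:, 2:4]

def project_grad(g, U_forget, U_remain):
    """Strip gradient components that would reintroduce forgotten
    knowledge or perturb remaining-class directions (a sketch)."""
    g = g - U_forget @ (U_forget.T @ g)   # block re-learning forgotten dirs
    g = g - U_remain @ (U_remain.T @ g)   # protect remaining-class dirs
    return g

g = rng.standard_normal(d)
g_safe = project_grad(g, U_forget, U_remain)
# g_safe now has zero component along both subspaces, so a gradient
# step cannot undo forgetting or damage retention in those directions.
```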

🛡️ Threat Analysis

Model Inversion Attack

The paper frames 'superficial forgetting' as a security risk in which an adversary can reconstruct forgotten training data from residual features (demonstrated via Deep Image Prior reconstruction). FG-OrIU explicitly defends against this by making reconstruction from forgotten-class features produce only noise, directly addressing data recovery from model internals. A retrain-the-head experiment further shows that residual information is adversarially exploitable, and the paper positions deep unlearning as a defense against this recovery.
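The retrain-the-head probe mentioned above can be illustrated with synthetic data: fit a fresh linear head on frozen features and treat its accuracy as a proxy for residual recoverable information. Everything here is a toy assumption, including the signal direction `v` and the least-squares head; it only shows why accuracy drops to near chance once the signal-carrying subspace is projected out.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 8, 200

# Toy "forgotten class" signal: the label is encoded along a single
# feature direction v (an assumption for this illustration).
v = np.zeros(d)
v[0] = 1.0
y = rng.integers(0, 2, n)
X = rng.standard_normal((n, d)) * 0.1 + np.outer(2 * y - 1, v)

def head_accuracy(X, y):
    """Retrain a least-squares linear head on frozen features and
    report its accuracy -- a proxy for residual recoverable info."""
    w, *_ = np.linalg.lstsq(X, 2.0 * y - 1.0, rcond=None)
    return np.mean((X @ w > 0) == (y == 1))

acc_before = head_accuracy(X, y)      # residual info: near-perfect accuracy
P = np.eye(d) - np.outer(v, v)        # project out the signal direction
acc_after = head_accuracy(X @ P, y)   # near chance: nothing left to recover
```

A high `acc_before` corresponds to superficial forgetting (the head recovers the class), while `acc_after` collapsing toward 0.5 corresponds to the deep-forgetting goal.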


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
training_time, white_box
Applications
image classification, incremental class unlearning, pre-trained vision model governance