Statistical Roughness-Informed Machine Unlearning
Mohammad Partohaghighi 1, Roummel Marcia 1, Bruce J. West 2, YangQuan Chen 1
Published on arXiv
2602.09304
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
SRAGU improves stability under hard or adversarial forget-set deletions by steering unlearning updates toward spectrally stable layers, yielding closer behavioral alignment with gold retrained reference models.
SRAGU (Statistical-Roughness Adaptive Gradient Unlearning)
Novel technique introduced
Machine unlearning aims to remove the influence of a designated forget set from a trained model while preserving utility on the retained data. In modern deep networks, approximate unlearning frequently fails under large or adversarial deletions due to pronounced layer-wise heterogeneity: some layers exhibit stable, well-regularized representations while others are brittle, undertrained, or overfit, so naive update allocation can trigger catastrophic forgetting or unstable dynamics.

We propose Statistical-Roughness Adaptive Gradient Unlearning (SRAGU), a mechanism-first unlearning algorithm that reallocates unlearning updates using layer-wise statistical roughness, operationalized via heavy-tailed spectral diagnostics of layer weight matrices. Starting from an Adaptive Gradient Unlearning (AGU) sensitivity signal computed on the forget set, SRAGU estimates a WeightWatcher-style heavy-tailed exponent for each layer, maps it to a bounded spectral stability weight, and uses this stability signal to spectrally reweight the AGU sensitivities before applying the same minibatch update form. This concentrates unlearning motion in spectrally stable layers while damping updates in unstable or overfit layers, improving stability under hard deletions.

We evaluate unlearning via behavioral alignment to a gold retrained reference model trained from scratch on the retained data, using empirical prediction-divergence and KL-to-gold proxies on a forget-focused query set; we additionally report membership inference auditing as a complementary leakage signal, treating forget-set points as should-be-forgotten members during evaluation.
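The per-layer spectral diagnostic can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `heavy_tail_alpha` is a crude Hill-estimator proxy for the WeightWatcher-style power-law exponent of a layer's empirical spectral density, and `stability_weight` is one plausible bounded map; its functional form and constants are assumptions, not taken from the paper.

```python
import numpy as np

def heavy_tail_alpha(W, tail_frac=0.1):
    """Rough Hill-estimator proxy for the heavy-tailed exponent of a
    layer's empirical spectral density. WeightWatcher fits the power
    law more carefully; this is only an illustration."""
    # Eigenvalues of the correlation matrix W^T W form the ESD support.
    evals = np.linalg.eigvalsh(W.T @ W)
    evals = np.sort(evals[evals > 1e-12])[::-1]   # descending
    k = max(2, int(tail_frac * len(evals)))       # size of the fitted tail
    tail = evals[:k]
    # Power-law MLE (Hill form): alpha = 1 + k / sum(log(lambda_i / lambda_min))
    return 1.0 + k / np.sum(np.log(tail / tail[-1]))

def stability_weight(alpha, center=3.0, width=1.5):
    """Bounded map from exponent to a stability weight in (0, 1].
    Peaks for alpha in the 'well-trained' band (roughly 2-4 in the
    heavy-tailed self-regularization literature) and decays for
    overfit or undertrained layers. Illustrative assumption only."""
    return float(np.exp(-((alpha - center) ** 2) / (2.0 * width ** 2)))
```

A layer whose exponent falls in the well-trained band receives a weight near 1 and absorbs most of the unlearning motion; layers far from that band are damped.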
Key Contributions
- Identifies layer-wise heterogeneity as a concrete failure mode of sensitivity-driven unlearning algorithms
- Proposes SRAGU, which reweights AGU unlearning updates using heavy-tailed spectral exponents (WeightWatcher-style) to concentrate updates in spectrally stable layers and dampen them in brittle/overfit layers
- Evaluates unlearning fidelity via KL divergence to a gold retrained model and membership inference auditing as a privacy leakage proxy
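The reweighting idea behind the second contribution can be sketched as follows. The AGU sensitivity signal and its exact update form are not reproduced here; per-layer gradients of the forget loss stand in for the sensitivities, and the step simply scales each layer's update by its spectral stability weight before a plain ascent step.

```python
import numpy as np

def sragu_step(params, forget_grads, stability, lr=1e-3):
    """One illustrative SRAGU-style update. `params` and
    `forget_grads` map layer names to arrays; `stability` maps
    layer names to bounded weights in [0, 1]. The sensitivity
    signal here (raw forget-loss gradients) is a stand-in for AGU."""
    new_params = {}
    for name, theta in params.items():
        s = stability[name]
        g = forget_grads[name]
        # Spectrally stable layers (s near 1) absorb most of the
        # unlearning motion; brittle layers (s near 0) barely move.
        new_params[name] = theta + lr * s * g  # ascent on the forget loss
    return new_params
```

With a stability weight of zero a layer is left untouched, which is exactly the damping behavior the contribution describes for brittle or overfit layers.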
🛡️ Threat Analysis
The paper explicitly reports membership inference auditing as a complementary leakage signal — treating forget-set points as 'should-be-forgotten members' — placing the evaluation squarely in an adversarial membership inference threat model. The introduction also frames unlearning's importance in terms of preventing membership inference and training data extraction.
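As a concrete instance of such an audit, a standard loss-threshold membership inference test can be run on forget-set points versus held-out non-members: lower loss predicts membership, and a well-unlearned model should score near AUC 0.5 on should-be-forgotten points. This sketch shows the generic attack shape, not the paper's specific auditing protocol.

```python
import numpy as np

def mia_auc(member_losses, nonmember_losses):
    """Loss-threshold membership inference audit: lower loss implies
    'member'. Returns the attack AUC via the Mann-Whitney U statistic;
    0.5 means the attacker cannot distinguish forget-set points from
    non-members, 1.0 means perfect leakage."""
    scores = np.concatenate([-np.asarray(member_losses),
                             -np.asarray(nonmember_losses)])
    labels = np.concatenate([np.ones(len(member_losses)),
                             np.zeros(len(nonmember_losses))])
    order = np.argsort(scores)
    ranks = np.empty_like(order, dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    # Mann-Whitney U statistic normalized to an AUC in [0, 1].
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

Run against a model before and after unlearning, a drop in this AUC toward 0.5 on forget-set points is the complementary leakage signal the paper reports.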