OSNIP: Breaking the Privacy-Utility-Efficiency Trilemma in LLM Inference via Obfuscated Semantic Null Space
Zhiyuan Cao 1,2,3, Zeyu Ma 1,2, Chenhao Yang 1,3, Han Zheng 4, Mingang Chen 1
1 Shanghai Key Laboratory of Computer Software Testing and Evaluating
Published on arXiv
2601.22752
Model Inversion Attack
OWASP ML Top 10 — ML03
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Key Finding
OSNIP sharply reduces adversarial input reconstruction attack success rates across 12 benchmarks while maintaining strong model utility under strict security constraints.
OSNIP
Novel technique introduced
We propose Obfuscated Semantic Null space Injection for Privacy (OSNIP), a lightweight client-side encryption framework for privacy-preserving LLM inference. Generalizing the geometric intuition of linear kernels to the high-dimensional latent space of LLMs, we formally define the "Obfuscated Semantic Null Space", a high-dimensional regime that preserves semantic fidelity while enforcing near-orthogonality to the original embedding. By injecting perturbations that project the original embedding into this space, OSNIP ensures privacy without any post-processing. Furthermore, OSNIP employs a key-dependent stochastic mapping that synthesizes individualized perturbation trajectories unique to each user. Evaluations on 12 generative and classification benchmarks show that OSNIP achieves state-of-the-art performance, sharply reducing attack success rates while maintaining strong model utility under strict security constraints.
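The core geometric move — blending a small component of the original embedding with a key-seeded direction orthogonal to it — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the blend parameter `alpha`, and the use of a plain seeded RNG are all assumptions for demonstration.

```python
import math
import random

def obfuscate(embedding, user_key, alpha=0.1):
    # Hypothetical sketch of near-orthogonal obfuscation (names and parameters
    # are illustrative, not from the paper): mix a small weight of the original
    # direction with a key-seeded orthogonal direction.
    rng = random.Random(user_key)                       # key-dependent randomness
    noise = [rng.gauss(0.0, 1.0) for _ in embedding]
    norm_x = math.sqrt(sum(v * v for v in embedding))
    x_hat = [v / norm_x for v in embedding]
    # Gram-Schmidt: strip the noise component along the original direction
    proj = sum(n * u for n, u in zip(noise, x_hat))
    ortho = [n - proj * u for n, u in zip(noise, x_hat)]
    norm_o = math.sqrt(sum(v * v for v in ortho))
    ortho = [v / norm_o for v in ortho]
    beta = math.sqrt(1.0 - alpha * alpha)
    # restore the original magnitude so downstream layers see a plausible scale
    return [norm_x * (alpha * u + beta * o) for u, o in zip(x_hat, ortho)]

x = [1.0, 2.0, 3.0, 4.0]
y = obfuscate(x, user_key=42)
cos = sum(a * b for a, b in zip(x, y)) / (
    math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
)
print(round(cos, 6))  # cosine similarity equals alpha (0.1): near-orthogonal
```

Because the orthogonal component dominates (`alpha` small), the obfuscated vector is geometrically far from the original, which is what frustrates inversion; the paper's contribution is choosing that orthogonal regime so the LLM's interpretation of the input is nonetheless preserved.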
Key Contributions
- Formal definition of the 'Obfuscated Semantic Null Space' — a high-dimensional embedding regime that is geometrically near-orthogonal to the original input yet semantically equivalent from the LLM's perspective, enabling perturbation-based obfuscation without post-processing
- Key-dependent stochastic perturbation mapping that synthesizes individualized obfuscation trajectories per user, preventing correlation attacks across users
- Lightweight client-side framework evaluated on 12 generative and classification benchmarks, achieving state-of-the-art attack success rate reduction while maintaining model utility
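The second contribution — a key-dependent stochastic mapping — implies each user's perturbations are reproducible from their own key but uncorrelated across users. A plausible sketch, using HMAC-SHA256 for seed derivation (the paper's exact construction is not specified here; function names and the HMAC choice are assumptions):

```python
import hashlib
import hmac
import random

def user_perturbation(user_key: bytes, nonce: bytes, dim: int):
    # Hypothetical key-dependent stochastic mapping: derive a per-user,
    # per-session seed via HMAC-SHA256, then draw a Gaussian perturbation
    # trajectory from it. Illustrative only.
    seed = int.from_bytes(
        hmac.new(user_key, nonce, hashlib.sha256).digest()[:8], "big"
    )
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

# Same input context, different user keys -> distinct perturbation trajectories,
# so obfuscated embeddings cannot be correlated across users.
p_alice = user_perturbation(b"alice-secret", b"session-1", dim=8)
p_bob = user_perturbation(b"bob-secret", b"session-1", dim=8)
assert p_alice != p_bob
# The same user with the same key and nonce reproduces their own trajectory:
assert p_alice == user_perturbation(b"alice-secret", b"session-1", dim=8)
```

Deriving the seed from a keyed MAC rather than from the raw key means two users processing identical text still draw independent perturbations, which is the property the per-user trajectory claim rests on.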
🛡️ Threat Analysis
The primary threat model is an adversary (cloud server or eavesdropper) performing embedding inversion — reconstructing the user's original private text from the obfuscated embeddings sent during LLM inference. OSNIP defends against this by injecting perturbations into a near-orthogonal 'Semantic Null Space', and the paper evaluates attack success rates against this reconstruction threat.
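To make the threat concrete, a toy embedding-inversion attack can be simulated as nearest-neighbour cosine matching against a known vocabulary table — a crude stand-in for the learned inversion models evaluated in the paper. The vocabulary, dimensions, and vectors below are invented for illustration:

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy vocabulary of "private" tokens with made-up embeddings.
rng = random.Random(0)
vocab = {w: [rng.gauss(0.0, 1.0) for _ in range(32)]
         for w in ["diagnosis", "salary", "address", "password"]}
secret = vocab["salary"]

# 1) Plaintext embedding: nearest-neighbour inversion succeeds trivially.
assert max(vocab, key=lambda w: cosine(vocab[w], secret)) == "salary"

# 2) Null-space-style obfuscation: replace the embedding with a key-seeded
#    direction made orthogonal to it via Gram-Schmidt.
key_rng = random.Random(1234)
noise = [key_rng.gauss(0.0, 1.0) for _ in secret]
proj = sum(n * s for n, s in zip(noise, secret)) / sum(s * s for s in secret)
obf = [n - proj * s for n, s in zip(noise, secret)]
assert abs(cosine(obf, secret)) < 1e-9  # near-orthogonal to the original
```

After obfuscation the cosine signal the attacker relied on is destroyed; OSNIP's evaluation measures exactly this kind of reconstruction failure, but against trained inversion attackers rather than this toy lookup.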