
Published on arXiv

2508.18971

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

ppNeSF reduces recoverable scene detail (average caption similarity 0.40 vs 0.68; FID 322 vs 250 on 7-Scenes) compared to a NeRF baseline without an RGB head, while maintaining competitive localization accuracy.

ppNeSF (Privacy-Preserving Neural Segmentation Field)

Novel technique introduced


Visual localization (VL) is the task of estimating the camera pose in a known scene. VL methods can be distinguished, among other criteria, by how they represent the scene: explicitly, through a (sparse) point cloud or a collection of images, or implicitly, through the weights of a neural network. Recently, NeRF-based methods have become popular for VL. While NeRFs offer high-quality novel view synthesis, they inadvertently encode fine scene details, raising privacy concerns when deployed in cloud-based localization services, as sensitive information could be recovered. In this paper, we tackle this challenge on two fronts. First, we propose a new protocol to assess the privacy preservation of NeRF-based representations. We show that NeRFs trained with photometric losses store fine-grained details in their geometry representations, making them vulnerable to privacy attacks even if the head that predicts colors is removed. Second, we propose ppNeSF (Privacy-Preserving Neural Segmentation Field), a NeRF variant trained with segmentation supervision instead of RGB images. These segmentation labels are learned in a self-supervised manner, ensuring they are coarse enough to obscure identifiable scene details while remaining discriminative in 3D. The segmentation space of ppNeSF can then be used for accurate visual localization, yielding state-of-the-art results.
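To make the core design concrete — a neural field supervised with segmentation labels rather than RGB — the following is a minimal sketch of a field that maps a 3D point to a density and per-class segmentation logits, volume-rendered along a ray with the standard NeRF quadrature. All weights, the class count `K`, and the sampling scheme are illustrative stand-ins, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a ppNeSF-style MLP: 3D point -> (density, K seg logits).
K = 8                                  # number of segment classes (assumed)
W1 = rng.normal(0, 1, (3, 32))
W2 = rng.normal(0, 1, (32, 1 + K))

def field(points):
    """points: (N, 3) -> density (N,), logits (N, K)."""
    h = np.tanh(points @ W1)
    out = h @ W2
    density = np.log1p(np.exp(out[:, 0]))   # softplus keeps density >= 0
    return density, out[:, 1:]

def render_segmentation(origin, direction, n_samples=64, t_far=4.0):
    """Volume-render expected per-class probabilities along one ray
    (standard NeRF alpha compositing, but over seg logits, not RGB)."""
    t = np.linspace(0.0, t_far, n_samples)
    delta = t_far / n_samples
    pts = origin[None, :] + t[:, None] * direction[None, :]
    density, logits = field(pts)
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)        # softmax per sample
    alpha = 1.0 - np.exp(-density * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    return weights @ probs                            # (K,) class probs

seg = render_segmentation(np.zeros(3), np.array([0.0, 0.0, 1.0]))
print(seg.round(3))
```

The point of the design is that only coarse class identities, not appearance, pass through the rendering pipeline, so a reconstruction attack has far less texture signal to invert.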


Key Contributions

  • Inversion attack demonstrating that NeRFs trained with photometric loss expose private scene texture details through geometry representations, even after the color-prediction head is removed
  • VLM-based privacy evaluation protocol using LLaVA to assess how much fine-grained scene detail can be recovered — more comprehensive than closed-set object detectors
  • ppNeSF: a privacy-preserving NeRF variant trained with self-supervised segmentation labels that achieves state-of-the-art visual localization without encoding identifiable scene appearance
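The caption-similarity scoring step of the VLM-based protocol can be sketched as follows: a VLM captions both a reference view and an attack reconstruction, and the two captions are compared numerically. The bag-of-words cosine below is a deliberately simple stand-in for whatever text-similarity measure the paper actually uses; the captions are hypothetical.

```python
import numpy as np
from collections import Counter

def caption_similarity(cap_a, cap_b):
    """Cosine similarity between bag-of-words vectors of two captions
    (a stronger text embedding would be used in practice; this only
    illustrates the scoring step of the protocol)."""
    ca, cb = Counter(cap_a.lower().split()), Counter(cap_b.lower().split())
    vocab = sorted(set(ca) | set(cb))
    va = np.array([ca[w] for w in vocab], float)
    vb = np.array([cb[w] for w in vocab], float)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Hypothetical captions: one for a held-out reference view, one for an
# attack reconstruction of the same scene.
ref = "a cluttered office desk with a monitor and papers"
rec = "a blurry room with indistinct shapes"
print(round(caption_similarity(ref, rec), 2))
```

A low score between reference and reconstruction captions (as in the reported 0.40 vs 0.68) indicates the representation leaks little describable scene content.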

🛡️ Threat Analysis

Model Inversion Attack

The primary attack inverts the rendered internal representations of a trained NeRF to reconstruct the private scene images used at training time — an adversary recovers identifiable scene details from model internals even after the color-prediction head is removed. ppNeSF is explicitly designed and evaluated as a defense against this data reconstruction threat.


Details

Domains
vision
Model Types
vlm
Threat Tags
white_box, inference_time
Datasets
7-Scenes, mip-NeRF 360
Applications
visual localization, cloud-based localization services, autonomous driving