Defense · 2025

AegisRF: Adversarial Perturbations Guided with Sensitivity for Protecting Intellectual Property of Neural Radiance Fields

Woo Jae Kim , Kyu Beom Han , Yoonki Cho , Youngju Na , Junsik Jung , Sooel Son , Sung-eui Yoon

0 citations · 75 references · arXiv


Published on arXiv · 2510.19371

Input Manipulation Attack · OWASP ML Top 10 — ML01

Output Integrity Attack · OWASP ML Top 10 — ML09

Key Finding

AegisRF disrupts unauthorized downstream tasks (multi-view classification and 3D localization) across diverse NeRF architectures while maintaining high visual fidelity of rendered scenes.

AegisRF

Novel technique introduced


As Neural Radiance Fields (NeRFs) have emerged as a powerful tool for 3D scene representation and novel view synthesis, protecting their intellectual property (IP) from unauthorized use is becoming increasingly crucial. In this work, we aim to protect the IP of NeRFs by injecting adversarial perturbations that disrupt their unauthorized applications. However, perturbing the 3D geometry of NeRFs can easily deform the underlying scene structure and thus substantially degrade the rendering quality, which has led existing attempts to avoid geometric perturbations or restrict them to explicit spaces like meshes. To overcome this limitation, we introduce a learnable sensitivity to quantify the spatially varying impact of geometric perturbations on rendering quality. Building upon this, we propose AegisRF, a novel framework that consists of a Perturbation Field, which injects adversarial perturbations into the pre-rendering outputs (color and volume density) of NeRF models to fool an unauthorized downstream target model, and a Sensitivity Field, which learns the sensitivity to adaptively constrain geometric perturbations, preserving rendering quality while disrupting unauthorized use. Our experimental evaluations demonstrate the generalized applicability of AegisRF across diverse downstream tasks and modalities, including multi-view image classification and voxel-based 3D localization, while maintaining high visual fidelity. Codes are available at https://github.com/wkim97/AegisRF.


Key Contributions

  • Perturbation Field that injects adversarial perturbations into NeRF color and volume density outputs to disrupt unauthorized downstream ML models.
  • Sensitivity Field that learns spatially varying geometric sensitivity to adaptively constrain perturbations, preserving legitimate rendering quality.
  • Demonstrated generalized IP protection across diverse modalities including multi-view image classification and voxel-based 3D localization.
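The two fields above can be sketched against standard NeRF volume rendering. This is a minimal illustrative sketch, not the paper's implementation: `d_rgb`/`d_sigma` stand in for Perturbation Field outputs, `sensitivity` for the Sensitivity Field, and the gating rule (scaling density perturbations down where sensitivity is high) is an assumption about how the adaptive constraint could act.

```python
import numpy as np

def render_ray(rgb, sigma, deltas):
    """Standard NeRF volume rendering along one ray of N samples."""
    alpha = 1.0 - np.exp(-sigma * deltas)                            # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))    # transmittance T_i
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)                      # composited RGB

def perturbed_render(rgb, sigma, deltas, d_rgb, d_sigma, sensitivity):
    """AegisRF-style sketch: perturb pre-rendering outputs (color, density).

    `sensitivity` in [0, 1] marks regions where a density change would visibly
    deform the scene; the geometric perturbation is suppressed there. The
    specific gating below is illustrative, not the paper's learned constraint.
    """
    gated_d_sigma = (1.0 - sensitivity) * d_sigma        # constrain geometric edits
    adv_sigma = np.clip(sigma + gated_d_sigma, 0.0, None)
    adv_rgb = np.clip(rgb + d_rgb, 0.0, 1.0)
    return render_ray(adv_rgb, adv_sigma, deltas)

# Toy ray with 8 samples
n = 8
rng = np.random.default_rng(0)
rgb = rng.uniform(0, 1, (n, 3))
sigma = rng.uniform(0, 5, n)
deltas = np.full(n, 0.1)

clean = render_ray(rgb, sigma, deltas)
# In a fully sensitive region, the density perturbation is suppressed entirely,
# so the rendering is unchanged even under a large d_sigma.
protected = perturbed_render(rgb, sigma, deltas,
                             np.zeros((n, 3)), 10.0 * np.ones(n), np.ones(n))
```

The sketch shows the key idea: color perturbations pass through directly, while geometric (density) perturbations are modulated per-point so that rendering quality is preserved where the scene is sensitive to them.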

🛡️ Threat Analysis

Input Manipulation Attack

The core mechanism crafts adversarial perturbations injected into NeRF pre-rendering outputs (color and volume density) that cause misclassification or failure in unauthorized downstream models. This is adversarial example generation at its core, evaluated against image classifiers and 3D localization models.
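A hedged sketch of how such a protection objective might look, assuming a cross-entropy adversarial term and a sensitivity-weighted penalty on density perturbations; the function name, the penalty form, and the `lam` weight are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def protection_objective(downstream_logits, true_label, sensitivity, d_sigma, lam=0.1):
    """Hypothetical AegisRF-style objective (names and weighting assumed).

    Encourages the unauthorized downstream classifier to err on rendered views
    (large cross-entropy) while penalizing density perturbations in regions the
    learned sensitivity marks as critical for rendering quality.
    """
    # numerically stable log-softmax of the downstream model's logits
    z = downstream_logits - downstream_logits.max()
    logp = z - np.log(np.exp(z).sum())
    adv_loss = -logp[true_label]                          # maximized by the protector
    fidelity_penalty = np.mean(sensitivity * d_sigma ** 2)
    return -adv_loss + lam * fidelity_penalty             # minimized overall

logits = np.array([2.0, 0.5, -1.0])
low = protection_objective(logits, 0, np.zeros(3), np.ones(3))    # no penalty
high = protection_objective(logits, 0, np.ones(3), np.ones(3))    # sensitive region
```

The penalty term grows where sensitivity is high, so the same density perturbation costs more in regions where it would degrade the legitimate rendering.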

Output Integrity Attack

The overarching goal is to protect NeRF output integrity by embedding adversarial protective perturbations, analogous to anti-deepfake and style-transfer protections. The cited Hönig et al. (ICLR 2025) specifically attacks such content protections, placing AegisRF on the defense side of the ML09 content-protection contest.


Details

Domains
vision
Model Types
transformer, cnn
Threat Tags
inference_time, digital, targeted
Datasets
ScanNet
Applications
neural radiance field IP protection, novel view synthesis, multi-view image classification, voxel-based 3D localization