Authors
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
UC Berkeley; Google Research; UC San Diego
Portals
Summary
This paper proposes a neural radiance field (NeRF): a simple fully-connected network (~5 MB of weights) trained to reproduce the input views of a single scene using a rendering loss. The network maps directly from spatial location and viewing direction (a 5D input) to color and volume density (a 4D output), acting as the "volume", so classical volume rendering can be used to differentiably render new views.
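As a concrete sketch of this mapping, a minimal PyTorch version might look like the following. The layer count and width are illustrative (the paper uses a deeper 8-layer, 256-wide MLP fed positionally encoded inputs), and, as in the paper, density depends on position alone while color also sees the viewing direction, passed here as a 3D unit vector:

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """Minimal NeRF-style MLP sketch: position -> density, (position, direction) -> color.
    Hypothetical simplification; not the paper's exact architecture."""
    def __init__(self, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(                 # processes position only
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)      # volume density: position only
        self.rgb_head = nn.Sequential(              # color: position + viewing direction
            nn.Linear(hidden + 3, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3),
        )

    def forward(self, xyz, view_dir):
        # xyz: (N, 3) sample positions; view_dir: (N, 3) unit viewing directions
        h = self.trunk(xyz)
        sigma = torch.relu(self.sigma_head(h)).squeeze(-1)  # density must be non-negative
        rgb = torch.sigmoid(self.rgb_head(torch.cat([h, view_dir], dim=-1)))  # in [0, 1]
        return rgb, sigma
```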
Abstract
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location $(x,y,z)$ and viewing direction $(\theta, \phi)$) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays and use classic volume rendering techniques to project the output colors and densities into an image. Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses. We describe how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrate results that outperform prior work on neural rendering and view synthesis. View synthesis results are best viewed as videos, so we urge readers to view our supplementary video for convincing comparisons.
Contribution
- An approach for representing continuous scenes with complex geometry and materials as 5D neural radiance fields, parameterized as basic MLP networks
- A differentiable rendering procedure based on classical volume rendering techniques, which we use to optimize these representations from standard RGB images. This includes a hierarchical sampling strategy to allocate the MLP's capacity towards space with visible scene content (see the sampling sketch after this list)
- A positional encoding to map each input 5D coordinate into a higher-dimensional space, which enables us to successfully optimize neural radiance fields to represent high-frequency scene content (see the encoding sketch after this list)
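A single-ray sketch of the hierarchical sampling step referenced above: the coarse network's compositing weights along a ray define a piecewise-constant PDF over depth, and fine sample depths are drawn from it by inverse transform sampling. The function name and unbatched signature are assumptions; the paper's version operates on batches of rays:

```python
import torch

def sample_pdf(bins, weights, n_samples):
    """Draw fine sample depths from the piecewise-constant PDF given by the
    coarse compositing weights. bins: (B+1,) depth bin edges; weights: (B,)."""
    pdf = weights / (weights.sum() + 1e-8)              # normalize (eps avoids /0)
    cdf = torch.cat([torch.zeros(1), torch.cumsum(pdf, dim=0)])  # (B+1,) CDF at edges
    u = torch.rand(n_samples)                           # uniform draws in [0, 1)
    idx = torch.searchsorted(cdf, u, right=True).clamp(1, len(bins) - 1)
    # linearly interpolate within the chosen bin
    t = (u - cdf[idx - 1]) / (cdf[idx] - cdf[idx - 1] + 1e-8)
    return bins[idx - 1] + t * (bins[idx] - bins[idx - 1])
```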
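The positional encoding in the last contribution is the paper's $\gamma(p) = \big(\sin(2^0\pi p), \cos(2^0\pi p), \ldots, \sin(2^{L-1}\pi p), \cos(2^{L-1}\pi p)\big)$, applied separately to each coordinate, with $L = 10$ for positions and $L = 4$ for viewing directions. A small sketch with an assumed function name:

```python
import math
import torch

def positional_encoding(p, num_freqs):
    """gamma(p): map each coordinate to sin/cos features at frequencies 2^k * pi.
    p: (..., D) raw coordinates; returns (..., D * 2 * num_freqs)."""
    freqs = (2.0 ** torch.arange(num_freqs)) * math.pi  # (L,) frequencies 2^k * pi
    angles = p.unsqueeze(-1) * freqs                    # (..., D, L)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)  # (..., D, 2L)
    return enc.flatten(-2)                              # one feature vector per point
```

The MLP then consumes `positional_encoding(xyz, 10)` and `positional_encoding(view_dir, 4)` in place of the raw 5D coordinate.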
Related Works
Neural 3D shape representations; View synthesis and image-based rendering
Comparisons
LLFF, SRN
Overview
We synthesize images by sampling 5D coordinates (location and viewing direction) along camera rays, feeding those coordinates into an MLP to produce a color and volume density, and using volume rendering techniques to composite these values into an image. This rendering function is differentiable, so we can optimize the scene representation by minimizing the residual between synthesized and ground-truth observed images.
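The compositing step follows the paper's discrete quadrature $\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \big(1 - \exp(-\sigma_i \delta_i)\big)\,\mathbf{c}_i$, with transmittance $T_i = \exp\big(-\sum_{j=1}^{i-1} \sigma_j \delta_j\big)$ and $\delta_i = t_{i+1} - t_i$. A single-ray sketch (the helper name is an assumption; the paper's implementation is batched over rays):

```python
import torch

def composite(rgb, sigma, t_vals):
    """Alpha-composite per-sample colors into one pixel color.
    rgb: (N, 3); sigma: (N,); t_vals: (N,) sample depths along the ray."""
    delta = t_vals[1:] - t_vals[:-1]                          # gaps between samples
    delta = torch.cat([delta, torch.full((1,), 1e10)])        # final gap ~ infinity
    alpha = 1.0 - torch.exp(-sigma * delta)                   # per-sample opacity
    # T_i: probability the ray reaches sample i without hitting anything earlier
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10]), dim=0)[:-1]
    weights = trans * alpha                                   # compositing weights
    return (weights[:, None] * rgb).sum(dim=0)                # (3,) pixel color
```

These `weights` are the same quantities that drive the hierarchical sampler sketched under Contribution.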