Authors
Lior Yariv, Peter Hedman, Christian Reiser, Dor Verbin, Pratul P. Srinivasan, Richard Szeliski, Jonathan T. Barron, Ben Mildenhall
Google Research; University of Tübingen; Weizmann Institute
Abstract
We present a method for reconstructing high-quality meshes of large unbounded real-world scenes suitable for photorealistic novel view synthesis. We first optimize a hybrid neural volume-surface scene representation designed to have well-behaved level sets that correspond to surfaces in the scene. We then bake this representation into a high-quality triangle mesh, which we equip with a simple and fast view-dependent appearance model based on spherical Gaussians. Finally, we optimize this baked representation to best reproduce the captured viewpoints, resulting in a model that can leverage accelerated polygon rasterization pipelines for real-time view synthesis on commodity hardware. Our approach outperforms previous scene representations for real-time rendering in terms of accuracy, speed, and power consumption, and produces high quality meshes that enable applications such as appearance editing and physical simulation.
Contribution
- high-quality neural surface reconstruction of unbounded real-world scenes,
- a framework for real-time rendering of these scenes in a browser, and
- a demonstration that spherical Gaussians are a practical representation of view-dependent appearance for view synthesis.
Related Work
View synthesis
Comparisons
Mobile-NeRF, Deep Blending, NeRF, NeRF++, Stable View Synthesis, Mip-NeRF 360, Instant-NGP
Overview
Our method is composed of three stages, which are visualized in Figure 2. First, we optimize a surface-based representation of the geometry and appearance of a scene using NeRF-like volume rendering. Then, we “bake” that geometry into a mesh, which we show is accurate enough to support convincing appearance editing and physics simulation. Finally, we train a new appearance model that uses spherical Gaussians (SGs) embedded within each vertex of the mesh, replacing the expensive NeRF-like appearance model from the first stage. The resulting 3D representation can be rendered in real time on commodity devices, as rendering simply requires rasterizing a mesh and querying a small number of spherical Gaussians.
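To make the final rendering step concrete, the sketch below evaluates a standard spherical-Gaussian appearance model: a vertex stores a diffuse color plus a few SG lobes, and the view-dependent color is the diffuse term plus each lobe's amplitude weighted by `exp(lambda * (mu . d - 1))`, which peaks when the view direction `d` aligns with the lobe axis `mu`. The function name and array layout here are illustrative, not the paper's implementation; the per-vertex parameter counts are assumptions.

```python
import numpy as np

def spherical_gaussian_color(view_dir, diffuse, mus, lambdas, amplitudes):
    """Evaluate view-dependent color from spherical Gaussian lobes
    stored at a mesh vertex (illustrative sketch, not the paper's code).

    view_dir:   (3,)  unit view direction
    diffuse:    (3,)  base RGB color
    mus:        (N,3) unit lobe axes
    lambdas:    (N,)  lobe sharpness values
    amplitudes: (N,3) lobe RGB amplitudes
    """
    # Each lobe contributes amplitude_i * exp(lambda_i * (mu_i . d - 1)),
    # which equals amplitude_i when d == mu_i and decays as they diverge.
    cos_angles = mus @ view_dir                     # (N,)
    weights = np.exp(lambdas * (cos_angles - 1.0))  # (N,)
    return diffuse + weights @ amplitudes           # (3,)

# Viewing straight down a single lobe's axis recovers diffuse + amplitude.
color = spherical_gaussian_color(
    view_dir=np.array([0.0, 0.0, 1.0]),
    diffuse=np.array([0.1, 0.1, 0.1]),
    mus=np.array([[0.0, 0.0, 1.0]]),
    lambdas=np.array([10.0]),
    amplitudes=np.array([[0.5, 0.0, 0.0]]),
)  # -> [0.6, 0.1, 0.1]
```

Because this closed-form evaluation involves only a dot product and an exponential per lobe, it maps directly onto a fragment shader after rasterization, which is what enables real-time rendering on commodity hardware.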