Neural Radiance Field Reconstruction with Depth and Normal Constraints

Abstract

This thesis addresses the problem of novel view synthesis from sparse-view supervision in the field of computer vision. Neural radiance fields (NeRFs) are a popular approach to this problem, but they rely heavily on large sets of images and precisely calibrated cameras. Motivated by recent advances in monocular geometry prediction, which allow for cheap generation of depth and normal maps, we systematically explore methods to incorporate these cues into the supervision of NeRFs. Our proposed method bounds the weights accumulated along rays using a Gaussian cumulative distribution function centered at the predicted depth. These bounds are derived directly from a Gaussian assumption on the likelihood of a ray being absorbed on its way through a neural volume. We show that our method, contrary to prior work, consistently improves reconstruction results for any number of training views, with photorealistic reconstructions being feasible from as few as three views. Our contribution to the field of computer vision is a flexible and easily implementable improvement to the performance of NeRFs for novel view synthesis.
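To make the core idea concrete, here is a minimal sketch of how such a depth-based bound could be enforced. It uses standard NeRF compositing weights and penalizes accumulated weight that exceeds a Gaussian-CDF upper bound at the predicted depth. The function names, the one-sided penalty, and the fixed standard deviation are illustrative assumptions, not the exact formulation derived in the thesis.

```python
import math

def ray_weights(sigmas, deltas):
    """Standard NeRF compositing weights w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    weights, transmittance = [], 1.0
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

def gaussian_cdf(t, mu, std):
    """CDF of a Gaussian with mean mu and standard deviation std, evaluated at t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (std * math.sqrt(2.0))))

def depth_bound_loss(ts, sigmas, deltas, depth_pred, std):
    """Hypothetical loss: penalize cumulative ray weight exceeding the Gaussian
    CDF about the predicted depth (i.e., mass absorbed too early along the ray).
    ts: sample positions along the ray; sigmas: densities; deltas: interval lengths."""
    weights = ray_weights(sigmas, deltas)
    loss, accumulated = 0.0, 0.0
    for t, w in zip(ts, weights):
        accumulated += w
        upper = gaussian_cdf(t, depth_pred, std)  # upper bound on absorbed mass up to t
        loss += max(0.0, accumulated - upper) ** 2
    return loss / len(ts)
```

A ray whose density concentrates well before the predicted depth violates the bound and incurs a positive loss, while an empty ray incurs none; a symmetric lower bound (penalizing mass absorbed too late) would follow the same pattern.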

Comparison to other methods, trained on only 3 synthetic images


Example scene reconstructed from 18 images


Other scenes with our method, also 18 images


Download my M.Sc. Thesis

View on GitHub