Talks and presentations

Neural Radiance Field Reconstruction with Depth and Normal Constraints

March 10, 2023

M.Sc. Thesis Defense Talk, TU Munich, School of Computation, Information and Technology - Informatics, Munich, Germany

This talk presents my M.Sc. thesis, which addresses the computer vision problem of novel view synthesis from sparse-view supervision. Neural radiance fields (NeRFs) are a popular approach to this problem, but they rely heavily on a large set of images and precisely calibrated cameras. Motivated by recent advances in monocular geometry prediction, which allow cheap generation of depth and normal maps, we systematically explore methods for incorporating these cues into the supervision of NeRFs. Our proposed method bounds the weights accumulated along rays using a Gaussian cumulative distribution function centered on the predicted depth. These bounds are derived directly from a Gaussian assumption on the likelihood of a ray being absorbed on its way through a neural volume. We show that our method, contrary to prior work, consistently improves reconstruction results for any number of training views, with photorealistic reconstructions being feasible with as few as three views. Our contribution is a flexible and easily implemented improvement to the performance of NeRFs for novel view synthesis.
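To make the idea of bounding accumulated ray weights with a Gaussian cumulative distribution function more concrete, here is a minimal sketch in plain Python. It is not the thesis implementation: the function names (`gaussian_cdf`, `render_weights`, `depth_bound_penalty`), the symmetric tolerance `slack`, and the hinge-style penalty are all assumptions made for illustration. Only the standard NeRF quadrature weights and the Gaussian CDF itself are standard.

```python
import math


def gaussian_cdf(t, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((t - mean) / (std * math.sqrt(2.0))))


def render_weights(densities, edges):
    """Standard NeRF quadrature weights w_i = T_i * (1 - exp(-sigma_i * delta_i)).

    `edges` holds the bin boundaries along the ray, so len(edges) == len(densities) + 1.
    """
    weights = []
    transmittance = 1.0
    for i, sigma in enumerate(densities):
        delta = edges[i + 1] - edges[i]
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def depth_bound_penalty(densities, edges, depth, std, slack=0.05):
    """Hypothetical penalty: how far the accumulated weights leave a band of
    width `slack` around the Gaussian CDF centered on the predicted depth."""
    penalty = 0.0
    accumulated = 0.0
    for i, w in enumerate(render_weights(densities, edges)):
        accumulated += w
        target = gaussian_cdf(edges[i + 1], depth, std)  # CDF at the far bin edge
        penalty += max(0.0, abs(accumulated - target) - slack)
    return penalty


# Eight bins of width 0.5 along the ray; monocular depth prediction at t = 2.0.
edges = [0.5 * i for i in range(9)]
near_surface = [0.0, 0.0, 0.0, 1.4, 50.0, 0.0, 0.0, 0.0]  # density around t = 2.0
floater = [50.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]       # density right at the camera

print(depth_bound_penalty(near_surface, edges, depth=2.0, std=0.25))
print(depth_bound_penalty(floater, edges, depth=2.0, std=0.25))
```

In this sketch, a ray whose density sits near the predicted depth stays inside the band, while a ray absorbed right in front of the camera (a typical sparse-view "floater") accumulates weight far too early and is penalized.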

3D Reconstruction from single RGB-D Images

September 03, 2022

Talk, TU Munich, School of Computation, Information and Technology - Informatics, Munich, Germany

This talk presents a two-part architecture for 3D reconstruction from a single RGB image. The first part uses a UNet to predict a depth map from the RGB input. This depth map is then voxelized into an incomplete occupancy grid. The second part, an IF-Net, completes this incomplete grid using additional supervision from the ground-truth mesh. The entire pipeline is trained end-to-end using differentiable voxelization. The study demonstrates the potential of IF-Nets for reconstructing large, intricate scenes and extends their capabilities with a depth regressor and differentiable voxelization.
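The step from a depth map to an incomplete occupancy grid can be sketched as follows. This is a simplified illustration, not the pipeline from the talk: it uses hard (non-differentiable) voxelization rather than the differentiable variant, assumes a pinhole camera, and every name and parameter (`backproject`, `voxelize`, `fx`, `fy`, `cx`, `cy`, `origin`, `voxel_size`, `resolution`) is hypothetical.

```python
import math


def backproject(depth, fx, fy, cx, cy):
    """Lift every pixel (u, v) with depth z > 0 to a camera-space point,
    assuming a pinhole camera with focal lengths fx, fy and principal point (cx, cy)."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:  # treat non-positive depth as "no measurement"
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def voxelize(points, origin, voxel_size, resolution):
    """Return the set of (i, j, k) indices of grid cells that contain a point.

    The grid is a cube of `resolution` cells per axis whose minimum corner is `origin`.
    """
    occupied = set()
    for point in points:
        index = tuple(
            int(math.floor((coord - start) / voxel_size))
            for coord, start in zip(point, origin)
        )
        if all(0 <= i < resolution for i in index):
            occupied.add(index)
    return occupied


# Toy 2x2 depth map; the bottom-right pixel has no valid depth.
depth_map = [[1.0, 1.0],
             [1.0, 0.0]]
points = backproject(depth_map, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
grid = voxelize(points, origin=(-1.0, -1.0, 0.0), voxel_size=0.5, resolution=4)
print(sorted(grid))
```

Only surface voxels visible from the camera are marked, which is why the resulting grid is incomplete and a completion network is needed afterwards. Hard assignment like this blocks gradients to the depth predictor, which is presumably the motivation for a differentiable voxelization in an end-to-end setup.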