Neural 3D Video Synthesis
Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, Zhaoyang Lv
3/3/2021
Keywords: Dynamic/Temporal, Global Conditioning, Local Conditioning
Venue: ARXIV 2021
Bibtex:
@article{li2021dynerf,
author = {Tianye Li and Mira Slavcheva and Michael Zollhoefer and Simon Green and Christoph Lassner and Changil Kim and Tanner Schmidt and Steven Lovegrove and Michael Goesele and Zhaoyang Lv},
title = {Neural 3D Video Synthesis},
journal = {arXiv preprint arXiv:2103.02597},
year = {2021},
url = {http://arxiv.org/abs/2103.02597v1}
}
Abstract
We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene in a compact, yet expressive representation that enables high-quality view synthesis and motion interpolation. Our approach takes the high quality and compactness of static neural radiance fields in a new direction: to a model-free, dynamic setting. At the core of our approach is a novel time-conditioned neural radiance field that represents scene dynamics using a set of compact latent codes. To exploit the fact that changes between adjacent frames of a video are typically small and locally consistent, we propose two novel strategies for efficient training of our neural network: 1) an efficient hierarchical training scheme, and 2) an importance sampling strategy that selects the next rays for training based on the temporal variation of the input videos. In combination, these two strategies significantly boost the training speed, lead to fast convergence of the training process, and enable high-quality results. Our learned representation is highly compact and able to represent a 10-second, 30 FPS multi-view video recording from 18 cameras with a model size of just 28MB. We demonstrate that our method can render high-fidelity wide-angle novel views at over 1K resolution, even for highly complex and dynamic scenes. We perform an extensive qualitative and quantitative evaluation that shows that our approach outperforms the current state of the art. We include additional video and information at: https://neural-3d-video.github.io/
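To make the two core ideas in the abstract concrete, here is a minimal PyTorch sketch (not the authors' code) of (1) a radiance field conditioned on a learned per-frame latent code and (2) per-pixel ray-importance weights derived from the temporal variation of the input videos. Layer widths, encoding frequencies, and the median-residual weighting are illustrative assumptions, not the paper's exact choices.

```python
# Sketch of a time-conditioned NeRF with per-frame latent codes and a
# temporal-variation-based ray importance weighting; hyperparameters and the
# exact weighting scheme are assumptions for illustration only.
import torch
import torch.nn as nn


def positional_encoding(x: torch.Tensor, num_freqs: int) -> torch.Tensor:
    """Standard NeRF-style sinusoidal encoding of coordinates in [-1, 1]."""
    freqs = (2.0 ** torch.arange(num_freqs, device=x.device, dtype=x.dtype)) * torch.pi
    angles = x[..., None] * freqs                          # (..., D, F)
    enc = torch.cat([angles.sin(), angles.cos()], dim=-1)  # (..., D, 2F)
    return enc.flatten(-2)                                 # (..., D * 2F)


class TimeConditionedNeRF(nn.Module):
    """Radiance field queried at (position, view direction, frame index).

    Scene dynamics are represented by a compact learned latent code per frame,
    concatenated with the encoded position before the MLP trunk.
    """

    def __init__(self, num_frames: int, latent_dim: int = 64,
                 pos_freqs: int = 10, dir_freqs: int = 4, width: int = 256):
        super().__init__()
        self.pos_freqs, self.dir_freqs = pos_freqs, dir_freqs
        self.frame_codes = nn.Embedding(num_frames, latent_dim)
        pos_dim, dir_dim = 3 * 2 * pos_freqs, 3 * 2 * dir_freqs
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim + latent_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(width, 1)
        self.color_head = nn.Sequential(
            nn.Linear(width + dir_dim, width // 2), nn.ReLU(),
            nn.Linear(width // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir, frame_idx):
        z = self.frame_codes(frame_idx)                     # (N, latent_dim)
        h = self.trunk(torch.cat(
            [positional_encoding(xyz, self.pos_freqs), z], dim=-1))
        sigma = torch.relu(self.sigma_head(h))              # (N, 1) density
        rgb = self.color_head(torch.cat(
            [h, positional_encoding(view_dir, self.dir_freqs)], dim=-1))
        return rgb, sigma


def temporal_importance_weights(video: torch.Tensor, floor: float = 0.02):
    """Per-pixel ray-sampling weights from temporal variation.

    `video` is (T, H, W, 3); pixels that deviate strongly from the temporal
    median are sampled more often, with a small floor so static regions are
    still visited occasionally (the precise scheme here is an assumption).
    """
    median = video.median(dim=0).values                     # (H, W, 3)
    residual = (video - median).abs().mean(dim=-1)          # (T, H, W)
    weights = residual.clamp_min(floor)
    return weights / weights.sum(dim=(1, 2), keepdim=True)  # normalized per frame
```

During training, rays for each frame would be drawn according to these weights (e.g., via `torch.multinomial` over the flattened per-frame map), so that dynamic regions dominate the batch while static background is revisited only occasionally.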