Bibliographic description:
Kovalenko Danylo. Text-Guided 3D Synthesis with Latent Diffusion Models, Faculty of Applied Sciences, Department of Computer Sciences. Lviv 2023, 38 p.
Short description (abstract):
The emergence of diffusion models has greatly impacted the field of deep generative
models, establishing them as a powerful family with state-of-the-art performance in applications such as text-to-image, image-to-image, and text-to-audio generation. In this work, we propose a solution for text-guided 3D synthesis
using denoising diffusion probabilistic models while minimizing memory and
computational requirements. Our goal is high-quality, high-fidelity
3D object generation, conditioned on text or a label, within a few seconds. We propose to use a triplane space parametrization in combination with a Latent Diffusion
Model (LDM) to generate smooth and coherent geometry. The LDM is trained on
a large-scale text-3D dataset and is used as a latent triplane texture generator. By
using a triplane space parametrization, we aim to improve the efficiency of the space
representation and reduce the computational cost of synthesis. We also give a
theoretical justification that this parametrization of 3D space can
encode not only the geometry but also the color and
reflectivity of the object. Additionally, we use an implicit neural renderer to decode
geometry details from triplane textures.
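
The triplane idea described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the abstract does not specify how plane features are aggregated or how the decoder is built, so the summation of the three plane features, the two-layer MLP, the random weights, and all names (`bilinear_sample`, `query_triplane`, `decode`) are assumptions made for illustration only. The sketch stores three feature planes of shape (C, R, R), which costs 3·C·R² values instead of the C·R³ a dense voxel grid would need, and decodes a 3D point by projecting it onto each plane.

```python
import numpy as np


def bilinear_sample(plane, u, v):
    """Bilinearly sample a (C, R, R) feature plane at coords u, v in [-1, 1]."""
    _, res, _ = plane.shape
    # Map [-1, 1] to continuous pixel coordinates in [0, res - 1].
    x = (u + 1.0) * 0.5 * (res - 1)
    y = (v + 1.0) * 0.5 * (res - 1)
    # Lower-left texel index, clipped so that index + 1 stays in bounds.
    x0 = np.clip(np.floor(x).astype(int), 0, res - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, res - 2)
    tx, ty = x - x0, y - y0
    # The plane is indexed as [channel, row (y), column (x)].
    top = plane[:, y0, x0] * (1 - tx) + plane[:, y0, x0 + 1] * tx
    bottom = plane[:, y0 + 1, x0] * (1 - tx) + plane[:, y0 + 1, x0 + 1] * tx
    return (top * (1 - ty) + bottom * ty).T  # shape (N, C)


def query_triplane(planes, points):
    """Project (N, 3) points onto the XY, XZ, YZ planes and sum the features.

    Summation is an assumed aggregation; concatenation is another common choice.
    """
    xy, xz, yz = planes
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return (bilinear_sample(xy, x, y)
            + bilinear_sample(xz, x, z)
            + bilinear_sample(yz, y, z))


def decode(features, w1, b1, w2, b2):
    """Hypothetical implicit decoder: a tiny MLP mapping features to density and RGB."""
    hidden = np.maximum(features @ w1 + b1, 0.0)   # ReLU
    out = hidden @ w2 + b2
    density = np.logaddexp(0.0, out[:, :1])        # softplus keeps density >= 0
    rgb = 1.0 / (1.0 + np.exp(-out[:, 1:4]))       # sigmoid keeps color in (0, 1)
    return density, rgb


rng = np.random.default_rng(0)
C, R, H = 8, 32, 16  # channels, plane resolution, hidden width (illustrative)
planes = [rng.normal(size=(C, R, R)) for _ in range(3)]
w1, b1 = rng.normal(size=(C, H)), np.zeros(H)
w2, b2 = rng.normal(size=(H, 4)), np.zeros(4)

points = rng.uniform(-1.0, 1.0, size=(5, 3))
density, rgb = decode(query_triplane(planes, points), w1, b1, w2, b2)
print(density.shape, rgb.shape)  # (5, 1) (5, 3)
```

In a pipeline of the kind the abstract outlines, the three planes would be produced by the latent diffusion model rather than sampled at random, and the decoder weights would be learned; additional output channels could carry quantities such as reflectivity.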