Text-Guided 3D Synthesis with Latent Diffusion Models

dc.contributor.author Kovalenko, Danylo
dc.date.accessioned 2023-07-14T07:30:16Z
dc.date.available 2023-07-14T07:30:16Z
dc.date.issued 2023
dc.identifier.citation Kovalenko, Danylo. Text-Guided 3D Synthesis with Latent Diffusion Models. Faculty of Applied Sciences, Department of Computer Sciences, Lviv, 2023, 38 p. uk
dc.identifier.uri https://er.ucu.edu.ua/handle/1/3945
dc.description.abstract The emergence of diffusion models has greatly impacted the field of deep generative models, establishing them as a powerful family of models with unparalleled performance in applications such as text-to-image, image-to-image, and text-to-audio tasks. In this work, we propose a solution for text-guided 3D synthesis using denoising diffusion probabilistic models while minimizing memory and computational requirements. Our goal is high-quality, high-fidelity 3D object generation, conditioned on text or a label, within seconds. We propose to use a triplane space parametrization in combination with a Latent Diffusion Model (LDM) to generate smooth and coherent geometry. The LDM is trained on a large-scale text-3D dataset and is used as a latent triplane texture generator. By using a triplane parametrization, we aim to improve the efficiency of the spatial representation and reduce the computational cost of synthesis. We also give a theoretical justification that this parametrization of 3D space can encode not only the geometry but also the color and reflectivity of the object. Additionally, we use an implicit neural renderer to decode geometry details from the triplane textures. uk
dc.language.iso en uk
dc.title Text-Guided 3D Synthesis with Latent Diffusion Models uk
dc.type Preprint uk
dc.status Published for the first time uk
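
Illustrative sketch (not code from the thesis): the abstract describes a triplane space parametrization decoded by an implicit neural renderer. The PyTorch snippet below shows one plausible way such a representation can be queried, assuming features are bilinearly sampled from three axis-aligned planes (XY, XZ, YZ), aggregated, and decoded by a small MLP into density and color. The plane resolution, feature width, and density/RGB output split are assumptions for illustration only.

# Minimal sketch of a triplane representation with an implicit MLP decoder.
# All hyperparameters (128x128 planes, 32 features, 4 outputs) are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriplaneDecoder(nn.Module):
    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        # Three axis-aligned feature planes: XY, XZ, YZ.
        self.planes = nn.Parameter(torch.randn(3, feat_dim, 128, 128) * 0.01)
        # Small MLP decoding aggregated plane features into density + RGB.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density channel + 3 color channels
        )

    def forward(self, xyz):
        # xyz: (N, 3) query points in [-1, 1]^3.
        coords = [xyz[:, [0, 1]], xyz[:, [0, 2]], xyz[:, [1, 2]]]  # XY, XZ, YZ
        feats = 0.0
        for plane, uv in zip(self.planes, coords):
            # grid_sample expects (B, C, H, W) input and (B, H_out, W_out, 2) grid.
            grid = uv.view(1, -1, 1, 2)
            sampled = F.grid_sample(plane.unsqueeze(0), grid, align_corners=True)
            feats = feats + sampled.view(plane.shape[0], -1).t()  # (N, feat_dim)
        out = self.mlp(feats)
        density, rgb = out[:, :1], torch.sigmoid(out[:, 1:])
        return density, rgb

# Usage: query 1024 random points inside the unit cube.
model = TriplaneDecoder()
pts = torch.rand(1024, 3) * 2 - 1
sigma, color = model(pts)
print(sigma.shape, color.shape)  # torch.Size([1024, 1]) torch.Size([1024, 3])

In the setting described in the abstract, the three feature planes would presumably be produced by the LDM acting as a latent triplane texture generator rather than learned directly as parameters; here they are free parameters only to keep the sketch self-contained.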

