CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images

Abstract
We present a novel framework for reconstructing animatable
human avatars from multiple images, termed CanonicalFusion. Our central concept involves integrating individual reconstruction results into the
canonical space. To be specific, we first predict Linear Blend Skinning
(LBS) weight maps and depth maps using a shared-encoder-dual-decoder
network, enabling direct canonicalization of the 3D mesh from the predicted depth maps. Here, instead of predicting high-dimensional skinning weights, we infer compressed skinning weights, i.e., a 3-dimensional
vector, with the aid of pre-trained MLP networks. We also introduce a
forward skinning-based differentiable rendering scheme to merge the reconstructed results from multiple images. This scheme refines the initial
mesh by reposing the canonical mesh via forward skinning and by
minimizing photometric and geometric errors between the rendered and
the predicted results. Our optimization scheme considers the position
and color of vertices as well as the joint angles for each image, thereby
mitigating the negative effects of pose errors. We conduct extensive experiments to demonstrate the effectiveness of our method and compare
our CanonicalFusion with state-of-the-art methods. Our source code is
available at https://github.com/jsshin98/CanonicalFusion.
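To illustrate the forward skinning step the abstract refers to, here is a minimal NumPy sketch of Linear Blend Skinning, where each canonical vertex is reposed by a weighted blend of per-joint transforms. The function name, array shapes, and weights are illustrative assumptions, not taken from the paper's released code:

```python
import numpy as np

def lbs_repose(vertices, weights, joint_transforms):
    """Forward Linear Blend Skinning.

    vertices:         (V, 3) canonical vertex positions
    weights:          (V, J) skinning weights, each row summing to 1
    joint_transforms: (J, 4, 4) homogeneous per-joint transforms
    Returns (V, 3) reposed vertex positions.
    """
    num_verts = vertices.shape[0]
    # Lift vertices to homogeneous coordinates: (V, 4)
    homo = np.concatenate([vertices, np.ones((num_verts, 1))], axis=1)
    # Blend per-joint transforms into one transform per vertex: (V, 4, 4)
    blended = np.einsum('vj,jab->vab', weights, joint_transforms)
    # Apply each vertex's blended transform
    posed = np.einsum('vab,vb->va', blended, homo)
    return posed[:, :3]
```

With identity joint transforms the mesh is unchanged; a vertex weighted 50/50 between a fixed joint and a translated joint moves half the translation, which is the behavior the differentiable-rendering refinement in the paper would backpropagate through.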
Author(s)
Shin, Jisu; Lee, Junmyeong; Lee, Seongmin; Park, Min-Gyu; Kang, Ju-Mi; Yoon, Ju Hong; Jeon, Hae-Gon
Issued Date
2024-10-02
Type
Conference Paper
URI
https://scholar.gist.ac.kr/handle/local/8151
Publisher
European Computer Vision Association (ECVA)
Citation
The 18th European Conference on Computer Vision (ECCV) 2024
ISSN
0302-9743
Conference Place
MiCo Milano, Italy
Appears in Collections:
Department of AI Convergence > 2. Conference Papers
Access and License
  • Access status: Open
File List

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.