CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes
Kim Youwang, Kim Ji-Yeon, Tae-Hyun Oh
06/09/2022
Keywords: Human (Body), Regularization, Coarse-to-Fine, Positional Encoding
Venue: ECCV 2022
Bibtex:
@article{youwang2022clipactor,
author = {Kim Youwang and Kim Ji-Yeon and Tae-Hyun Oh},
title = {CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes},
year = {2022},
month = {Jun},
url = {http://arxiv.org/abs/2206.04382v2}
}
Abstract
We propose CLIP-Actor, a text-driven motion recommendation and neural mesh stylization system for human mesh animation. CLIP-Actor animates a 3D human mesh to conform to a text prompt by recommending a motion sequence and optimizing mesh style attributes. We build a text-driven human motion recommendation system by leveraging a large-scale human motion dataset with language labels. Given a natural language prompt, CLIP-Actor suggests a text-conforming human motion in a coarse-to-fine manner. Then, our novel zero-shot neural style optimization adds geometric detail and texture to the recommended mesh sequence so that it conforms to the prompt in a temporally consistent and pose-agnostic manner. This is distinctive in that prior work fails to generate plausible results when the pose of an artist-designed mesh does not conform to the text from the beginning. We further propose spatio-temporal view augmentation and mask-weighted embedding attention, which stabilize the optimization process by leveraging multi-frame human motion and rejecting poorly rendered views. We demonstrate that CLIP-Actor produces plausible, human-recognizable stylized 3D human meshes in motion, with detailed geometry and texture, solely from a natural language prompt.
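The abstract names mask-weighted embedding attention only at a high level. The sketch below illustrates one plausible reading of the idea and is not the authors' implementation. It assumes each rendered view comes with an image embedding and a foreground coverage ratio (the fraction of pixels showing the mesh), drops views below a cutoff, and pools the rest with coverage-proportional weights. The function names, the `min_coverage` parameter, and the linear weighting are all hypothetical.

```python
import math


def mask_weighted_embedding(view_embeddings, mask_coverages, min_coverage=0.05):
    """Pool per-view embeddings, weighting each view by its foreground coverage.

    Hypothetical helper: views whose coverage is below `min_coverage` are
    treated as poorly rendered and receive zero weight.
    """
    if len(view_embeddings) != len(mask_coverages):
        raise ValueError("need exactly one coverage value per view embedding")

    # Reject views where the mesh is barely visible.
    weights = [c if c >= min_coverage else 0.0 for c in mask_coverages]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all views were rejected")

    # Weighted average of the surviving embeddings.
    dim = len(view_embeddings[0])
    pooled = [0.0] * dim
    for embedding, weight in zip(view_embeddings, weights):
        for i in range(dim):
            pooled[i] += (weight / total) * embedding[i]
    return pooled


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for image embeddings of three views.
    views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    coverages = [0.6, 0.2, 0.01]  # the third view is almost empty
    text_embedding = [1.0, 0.0]   # stand-in for a text embedding

    pooled = mask_weighted_embedding(views, coverages)
    print(pooled, cosine_similarity(pooled, text_embedding))
```

In a style-optimization loop, the similarity between the pooled image embedding and the text embedding would act as the objective, so nearly empty renders contribute nothing to the update.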