Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity
https://export.arxiv.org/abs/2009.02119
motion, GAN, graphics, KAIST, ETRI, 2020