The KU-ISPL entry to the GENEA Challenge 2023-A Diffusion Model for Co-speech Gesture generation

Gwantae Kim, Yuanming Li, Hanseok Ko

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper describes the diffusion model for co-speech gesture generation submitted as the KU-ISPL entry to the GENEA Challenge 2023. We divide gesture generation into two subproblems, co-speech gesture generation and semantic gesture generation, and focus on the former, which we address with a denoising diffusion probabilistic model conditioned on text, audio, and pre-pose. We use a U-Net with cross-attention as the denoising model, and we propose a gesture autoencoder that maps from the gesture domain to the latent domain. The collective evaluation released by the GENEA Challenge 2023 shows that our model successfully generates co-speech gestures. Our system receives a mean human-likeness score of 32.0, a preference-matched appropriateness score of 53.6% for the main agent's speech, and an appropriateness score of 53.5% for the interlocutor's speech. We also conduct an ablation study to measure the effect of the pre-pose condition. These results indicate that our system contributes to co-speech gesture generation for natural interaction.
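
As a rough illustration of the denoising diffusion probabilistic model named in the abstract, the sketch below shows only the generic forward (noising) step and its inversion given a noise estimate. It is not the authors' implementation. The linear schedule and its values, the function names, and the toy latent vector are all assumptions for illustration. The U-Net with cross-attention, the gesture autoencoder, and the text, audio, and pre-pose conditioning are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed values, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).

    x0 stands in for a latent gesture vector (hypothetical toy input).
    Returns the noised vector and the noise that was added.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return x_t, noise


def predict_x0(x_t, t, predicted_noise, alpha_bars):
    """Recover an estimate of x_0 from x_t and a noise estimate.

    In a trained system the noise estimate would come from the denoising
    network; here the caller supplies it directly.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [(x - sigma * e) / signal for x, e in zip(x_t, predicted_noise)]
```

Calling `q_sample` on a latent vector at step `t` and then passing the returned noise to `predict_x0` reproduces the input up to floating-point error. That round trip is the relationship a denoising network is trained to approximate when the true noise is unknown.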

Original language: English
Title of host publication: ICMI 2023 Companion - Companion Publication of the 25th International Conference on Multimodal Interaction
Publisher: Association for Computing Machinery
Pages: 220-227
Number of pages: 8
ISBN (Electronic): 9798400703218
Publication status: Published - 2023 Oct 9
Event: 25th International Conference on Multimodal Interaction, ICMI 2023 Companion - Paris, France
Duration: 2023 Oct 9 to 2023 Oct 13

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 25th International Conference on Multimodal Interaction, ICMI 2023 Companion
Country/Territory: France
City: Paris
Period: 23/10/9 to 23/10/13

Bibliographical note

Publisher Copyright:
© 2023 ACM.

Keywords

  • GENEA Challenge
  • co-speech gesture generation
  • diffusion
  • generative models
  • neural networks

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software
