Adaptive Multi-Domain Dialogue State Tracking on Spoken Conversations

Jungwoo Lim, Taesun Whang, Dongyub Lee, Heuiseok Lim

Research output: Contribution to journal, Article, peer-review


The main objective of a task-oriented dialogue system is to identify the intent and needs expressed in human dialogue. Many existing studies are conducted on written dialogue, but coping with real-world spoken dialogue remains difficult. To this end, the DSTC10 challenge organizers propose the task of building robust dialogue state tracking (DST) models on spoken dialogues. Building on a powerful existing DST model (i.e., MinTL), this article suggests two integral components for building a dialogue state tracker: 1) Data augmentation, which enhances the model's ability to catch the entities that exist in the evaluation dataset. 2) Levenshtein post-processing, which aims to prevent distortion in model predictions caused by automatic speech recognition errors. To validate the effectiveness of our methods, we evaluate our model on the DSTC10 datasets and conduct qualitative analysis by ablating each component of the model. Experimental results show that our model significantly outperforms baselines in all evaluation metrics and took 3rd place in the challenge.
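
The Levenshtein post-processing idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so everything below is an assumption: the function names `levenshtein` and `snap_to_ontology`, the `max_ratio` threshold, and the example entity names are hypothetical. The sketch simply replaces a predicted slot value with the closest known entity name when the edit distance is small, which is one plausible way to absorb speech recognition misspellings.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where insertion, deletion, and substitution each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def snap_to_ontology(predicted: str, candidates: list[str],
                     max_ratio: float = 0.4) -> str:
    """Return the closest candidate, or the prediction unchanged if none is close.

    `max_ratio` is a hypothetical threshold: the allowed edit distance as a
    fraction of the longer string's length.
    """
    if not candidates:
        return predicted
    pred = predicted.lower().strip()
    best = min(candidates, key=lambda c: levenshtein(pred, c.lower()))
    distance = levenshtein(pred, best.lower())
    limit = max_ratio * max(len(pred), len(best))
    return best if distance <= limit else predicted


# Hypothetical entity names, used only for illustration.
entities = ["Hotel Nikko", "Hotel Zephyr"]
print(snap_to_ontology("hotel nico", entities))
print(snap_to_ontology("cheap", entities))
```

With these inputs, a misrecognized value such as "hotel nico" is mapped to the nearest known entity, while an unrelated value such as "cheap" is left unchanged because it exceeds the distance threshold.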

Original language: English
Pages (from-to): 727-732
Number of pages: 6
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication status: Published - 2024

Bibliographical note

Publisher Copyright:
© 2023 The Authors.

Keywords


  • DSTC10
  • dialogue state tracking
  • spoken dialogue

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computational Mathematics
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

