BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text

Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Hyeonseok Moon, Sugyeong Eo, Heuiseok Lim

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    22 Citations (Scopus)

    Abstract

    With the growing popularity of smart speakers, such as Amazon Alexa, speech is becoming one of the most important modes of human-computer interaction. Automatic speech recognition (ASR) is arguably the most critical component of such systems, as errors in speech recognition propagate to the downstream components and drastically degrade the user experience. A simple and effective way to improve speech recognition accuracy is to apply an automatic post-processor to the recognition result. However, training a post-processor requires parallel corpora created by human annotators, which are expensive and not scalable. To alleviate this problem, we propose Back TranScription (BTS), a denoising-based method that can create such corpora without human labor. Using a raw corpus, BTS corrupts the text using Text-to-Speech (TTS) and Speech-to-Text (STT) systems. Then, a post-processing model can be trained to reconstruct the original text given the corrupted input. Quantitative and qualitative evaluations show that a post-processor trained using our approach is highly effective in fixing non-trivial speech recognition errors such as mishandling foreign words. We present the generated parallel corpus and post-processing platform to make our results publicly available.
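    The sketch below is a rough illustration of the data-generation idea described in the abstract, not the authors' implementation: synthesized speech is passed back through a recognizer, and the noisy transcription is paired with the original text as training data for a post-processor. The `synthesize` and `transcribe` functions are hypothetical stand-ins for whatever TTS and STT systems are available.

    ```python
    # Minimal BTS (Back TranScription) data-generation sketch.
    # `synthesize` (TTS) and `transcribe` (STT) are hypothetical placeholders;
    # a real setup would call actual speech synthesis and recognition systems.
    from typing import Callable, List, Tuple


    def synthesize(text: str) -> bytes:
        """Stand-in TTS: render text to audio (stubbed as bytes here)."""
        return text.encode("utf-8")


    def transcribe(audio: bytes) -> str:
        """Stand-in STT: recognize audio back into text.
        A real recognizer introduces the errors BTS relies on;
        this stub only simulates a trivial corruption."""
        return audio.decode("utf-8").lower()


    def back_transcribe(
        corpus: List[str],
        tts: Callable[[str], bytes] = synthesize,
        stt: Callable[[bytes], str] = transcribe,
    ) -> List[Tuple[str, str]]:
        """Create (noisy, clean) pairs: the STT output of synthesized speech
        is the post-processor's input, the original text is its target."""
        pairs = []
        for clean in corpus:
            noisy = stt(tts(clean))
            pairs.append((noisy, clean))
        return pairs


    if __name__ == "__main__":
        raw = ["Play the latest album by BTS on the living-room speaker."]
        for noisy, clean in back_transcribe(raw):
            print(f"input : {noisy}\ntarget: {clean}")
    ```

    Because the corruption comes from real TTS and STT systems rather than hand-crafted noise, the resulting pairs reflect genuine recognition errors, which is what makes the approach scale without human annotation.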

    Original language: English
    Title of host publication: WAT 2021 - 8th Workshop on Asian Translation, Proceedings of the Workshop
    Editors: Toshiaki Nakazawa, Hideki Nakayama, Isao Goto, Hideya Mino, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Shohei Higashiyama, Hiroshi Manabe, Win Pa Pa, Shantipriya Parida, Ondrej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 106-116
    Number of pages: 11
    ISBN (Electronic): 9781954085633
    Publication status: Published - 2021
    Event: 8th Workshop on Asian Translation, WAT 2021 - Virtual, Bangkok, Thailand
    Duration: 2021 Aug 5 - 2021 Aug 6

    Publication series

    Name: WAT 2021 - 8th Workshop on Asian Translation, Proceedings of the Workshop

    Conference

    Conference: 8th Workshop on Asian Translation, WAT 2021
    Country/Territory: Thailand
    City: Virtual, Bangkok
    Period: 21/8/5 - 21/8/6

    Bibliographical note

    Funding Information:
    This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation); by an IITP grant funded by the Korean government (MSIT) (No. 2020-0-00368, A Neural-Symbolic Model for Knowledge Acquisition and Inference Techniques); and by the MSIT, Korea, under the ICT Creative Consilience program (IITP-2021-2020-0-01819) supervised by the IITP.

    Publisher Copyright:
    © 2021 Association for Computational Linguistics.

    ASJC Scopus subject areas

    • Language and Linguistics
    • Computational Theory and Mathematics
    • Computer Science Applications
    • Software
