Speaker Anonymization for Personal Information Protection Using Voice Conversion Techniques

In Chul Yoo, Keonnyeong Lee, Seonggyun Leem, Hyunwoo Oh, Bonggu Ko, Dongsuk Yook

    Research output: Contribution to journal > Article > peer-review

    20 Citations (Scopus)

    Abstract

    As speech-based user interfaces integrated into devices such as AI speakers become ubiquitous, a large amount of user voice data is being collected to enhance the accuracy of speech recognition systems. Because such voice data contain personal information that can endanger the privacy of users, privacy protection for speech data has garnered increasing attention since the introduction of the General Data Protection Regulation (GDPR) in the EU, which implies that restrictions and safety measures for the use of speech data are becoming essential. This study aims to filter out the speaker-related voice biometrics present in speech data, such as voice fingerprints, without altering the linguistic content, so that the data remain useful while the privacy of users is protected. To achieve this, we propose an algorithm that produces anonymized speech by adopting many-to-many voice conversion techniques based on variational autoencoders (VAEs) and modifying the speaker identity vectors of the VAE input. We validated the effectiveness of the proposed method by measuring the speaker-related information and the original linguistic information retained in the resulting speech, using an open-source speaker recognizer and a deep neural network-based automatic speech recognizer, respectively. Using the proposed method, the speaker identification accuracy of the speech data was reduced to 0.1–9.2%, indicating successful anonymization, while the speech recognition accuracy was maintained at 78.2–81.3%.
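
The idea of anonymizing speech by changing the speaker identity vector can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `decode` function stands in for a trained conditional VAE decoder, and the mixing rule in `anonymized_identity` (a random convex combination of the training speakers) is an assumption chosen only for illustration, since the abstract does not state how the identity vectors are modified. All names here are hypothetical.

```python
import random


def one_hot(index, n_speakers):
    """Speaker identity vector that selects one known training speaker."""
    return [1.0 if i == index else 0.0 for i in range(n_speakers)]


def anonymized_identity(n_speakers, rng):
    """Hypothetical anonymization rule (an assumption, not the paper's):
    random non-negative weights normalized to sum to 1, so the vector
    is a mixture of training speakers rather than any single one."""
    raw = [rng.random() for _ in range(n_speakers)]
    total = sum(raw)
    return [w / total for w in raw]


def decode(latent, identity, speaker_offsets):
    """Toy stand-in for a conditional VAE decoder.

    Adds an identity-weighted speaker offset to the latent content vector.
    A real decoder would be a neural network conditioned on `identity`."""
    n_speakers = len(identity)
    return [
        latent[d] + sum(identity[s] * speaker_offsets[s][d] for s in range(n_speakers))
        for d in range(len(latent))
    ]


rng = random.Random(0)
n_speakers, dim = 4, 3

# Made-up per-speaker characteristics; a trained decoder would learn these.
speaker_offsets = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_speakers)]

# Latent content of one frame, assumed to carry linguistic information only.
latent = [0.5, -0.2, 0.1]

# Reconstruction with the true speaker (index 2) versus an anonymized identity.
original = decode(latent, one_hot(2, n_speakers), speaker_offsets)
anon_vec = anonymized_identity(n_speakers, rng)
anonymized = decode(latent, anon_vec, speaker_offsets)
```

In this sketch the latent content is identical in both calls and only the identity vector differs, which mirrors the stated goal of removing speaker-related information while leaving the linguistic content unchanged.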

    Original language: English
    Pages (from-to): 198637-198645
    Number of pages: 9
    Journal: IEEE Access
    Volume: 8
    Publication status: Published - 2020

    Bibliographical note

    Funding Information:
    This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning under Grant NRF-2017R1E1A1A01078157, in part by the Ministry of Science and ICT (MSIT) through the Information Technology Research Center (ITRC) Support Program supervised by the Institute for Information & Communications Technology Promotion (IITP) under Grant IITP-2018-0-01405, in part by the IITP grant funded by the Korean Government (MSIP) (A research on safe and convenient big data processing methods) under Grant 2018-0-00269, and in part by the Korea University Grant.

    Publisher Copyright:
    © 2020 Institute of Electrical and Electronics Engineers Inc. All rights reserved.

    Keywords

    • Data privacy
    • Deep neural networks
    • Speaker anonymization
    • Variational autoencoder
    • Voice conversion

    ASJC Scopus subject areas

    • General Computer Science
    • General Materials Science
    • General Engineering
