Resmax: Detecting voice spoofing attacks with residual network and max feature map

Il Youp Kwak, Sungsu Kwag, Junhee Lee, Jun Ho Huh, Choong Hoon Lee, Youngbae Jeon, Jeonghwan Hwang, Ji Won Yoon

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    17 Citations (Scopus)

    Abstract

    The “2019 Automatic Speaker Verification Spoofing And Countermeasures Challenge” (ASVspoof) competition aimed to facilitate the design of highly accurate voice spoofing attack detection systems. However, the competition did not emphasize model complexity or latency requirements, even though such constraints are strict and integral to real-world deployment. Hence, most of the top-performing solutions from the competition used an ensemble approach, combining multiple complex deep learning models to maximize detection accuracy; this kind of approach would sit uneasily with real-world deployment constraints. To design a lightweight system, we combined the notions of skip connection (from ResNet) and max feature map (from Light CNN), and evaluated the accuracy of the system using the ASVspoof 2019 dataset. With an optimized constant Q transform (CQT) feature, our single model achieved a replay attack detection equal error rate (EER) of 0.37% on the evaluation set, surpassing the top ensemble system from the competition, which achieved an EER of 0.39%.
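The two ideas the abstract combines can be illustrated with a minimal, dependency-free sketch. This is not the authors' architecture: the function names, the channel-mixing stand-in for a convolution, and the weights are all hypothetical, chosen only to show how a max feature map (halving the channel count by an element-wise maximum) can sit inside a skip connection.

```python
def max_feature_map(channels):
    """Split the channels into two halves and keep the element-wise maximum."""
    n = len(channels)
    if n % 2 != 0:
        raise ValueError("max feature map needs an even number of channels")
    half = n // 2
    return [
        [max(a, b) for a, b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(channels[:half], channels[half:])
    ]


def mix_channels(channels, weights):
    """Hypothetical stand-in for a 1x1 convolution: each output channel is a
    weighted sum of the input channels at every position."""
    length = len(channels[0])
    return [
        [sum(w * ch[i] for w, ch in zip(row, channels)) for i in range(length)]
        for row in weights
    ]


def residual_mfm_block(channels, weights):
    """Illustrative block: expand C -> 2C channels, reduce back to C with the
    max feature map, then add the input (the skip connection)."""
    expanded = mix_channels(channels, weights)   # 2C channels
    activated = max_feature_map(expanded)        # back to C channels
    return [
        [x + a for x, a in zip(ch, act)]
        for ch, act in zip(channels, activated)
    ]


# Two input channels with two positions each (assumed toy values).
x = [[1.0, -2.0], [0.5, 3.0]]
# Assumed weights that expand x into [x0, x1, -x0, -x1].
w = [[1, 0], [0, 1], [-1, 0], [0, -1]]
y = residual_mfm_block(x, w)
print(y)
```

With these toy weights the max feature map reduces to an absolute value, so the block returns the input plus its magnitude. In a trained network the weights would be learned and the mixing step would be a real convolution.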

    Original language: English
    Title of host publication: Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 4837-4844
    Number of pages: 8
    ISBN (Electronic): 9781728188089
    Publication status: Published - 2020
    Event: 25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Milan, Italy
    Duration: 2021 Jan 10 → 2021 Jan 15

    Publication series

    Name: Proceedings - International Conference on Pattern Recognition
    ISSN (Print): 1051-4651

    Conference

    Conference: 25th International Conference on Pattern Recognition, ICPR 2020
    Country/Territory: Italy
    City: Virtual, Milan
    Period: 21/1/10 → 21/1/15

    Bibliographical note

    Funding Information:
    This work was conducted at Samsung Research. The authors would like to thank the Samsung Research Security Team for the helpful discussions. IK was supported by the National Research Foundation of Korea (NRF) grant funded by the Ministry of Science and ICT (2020R1C1C1A01013020).

    Publisher Copyright:
    © 2020 IEEE

    Keywords

    • Voice assistant security
    • Voice presentation attack detection
    • Voice spoofing attack
    • Voice synthesis attack

    ASJC Scopus subject areas

    • Computer Vision and Pattern Recognition
