Detection and Defense: Student-Teacher Network for Adversarial Robustness

Kyoungchan Park, Pilsung Kang

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Defense against adversarial attacks is critical for the reliability and safety of deep neural networks (DNNs). Current state-of-the-art defense methods achieve significant robustness against adversarial attacks. However, these methods cannot distinguish adversarial examples (AEs) from normal examples (NEs). They therefore apply the same defense process to both kinds of input before classification, which degrades performance on NEs. In this paper, we propose a novel defense method based on the student-teacher framework that minimizes the classification performance degradation on NEs by detecting AEs and applying the defense process only to them. Because an adversarial attack cannot succeed without distorting the hidden layer features, we train the student network to predict the undistorted hidden layer features of the teacher network (the target DNN). Our method can thus detect AEs through the difference between the hidden layer features of the student and teacher networks, and then recover the classification result of AEs using the penultimate layer features predicted by the student network. Through extensive experiments on representative image classification benchmark datasets, i.e., CIFAR-10, CIFAR-100, and TinyImagenet, we demonstrate the superiority of our method in both detection and defense compared with state-of-the-art methods. Furthermore, we show that our method achieves robust detection and defense performance against a fully white-box attack, in which the attacker is assumed to have full knowledge of our entire detection and defense mechanism.
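
The detect-then-defend flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the mean-squared-difference score, the fixed threshold, and the bias-free linear classifier head are all hypothetical stand-ins, and the feature vectors are toy values rather than real network activations.

```python
from typing import List, Sequence, Tuple


def feature_gap(student_feat: Sequence[float], teacher_feat: Sequence[float]) -> float:
    """Mean squared difference between two feature vectors (assumed detection score)."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    total = sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat))
    return total / len(teacher_feat)


def linear_head(features: Sequence[float], weights: Sequence[Sequence[float]]) -> int:
    """Argmax class of a bias-free linear head (hypothetical final layer of the teacher)."""
    logits: List[float] = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return max(range(len(logits)), key=logits.__getitem__)


def detect_and_defend(
    teacher_feat: Sequence[float],
    student_feat: Sequence[float],
    head_weights: Sequence[Sequence[float]],
    threshold: float,
) -> Tuple[bool, int]:
    """Flag the input as adversarial if the student and teacher features disagree.

    Flagged inputs are classified from the student's predicted penultimate
    features; all other inputs keep the teacher's own features, so normal
    examples are classified exactly as the undefended teacher would.
    """
    is_adversarial = feature_gap(student_feat, teacher_feat) > threshold
    features = student_feat if is_adversarial else teacher_feat
    return is_adversarial, linear_head(features, head_weights)


# Toy usage: a two-class head and hand-picked penultimate features.
head = [[1.0, 0.0], [0.0, 1.0]]
clean = detect_and_defend([2.0, 0.5], [1.9, 0.6], head, threshold=0.5)
attacked = detect_and_defend([0.5, 2.0], [1.9, 0.6], head, threshold=0.5)
print(clean, attacked)
```

In the toy usage, the attacked input has distorted teacher features that would flip the prediction to class 1, while the student's prediction of the undistorted features restores class 0. How the student is trained, which layers are compared, and how the threshold is chosen are left to the paper itself.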

    Original language: English
    Pages (from-to): 82742-82752
    Number of pages: 11
    Journal: IEEE Access
    Volume: 12
    Publication status: Published - 2024

    Bibliographical note

    Publisher Copyright:
    © 2013 IEEE.

    Keywords

    • Adversarial attack
    • adversarial defense
    • adversarial detection
    • student-teacher network

    ASJC Scopus subject areas

    • General Computer Science
    • General Materials Science
    • General Engineering
