Understanding Patch-Based Learning of Video Data by Explaining Predictions

Christopher J. Anders, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

    Research output: Chapter in Book/Report/Conference proceeding (Chapter)

    20 Citations (Scopus)

    Abstract

    Deep neural networks have been shown to learn highly predictive models of video data. Because individual videos contain a large number of images, a common training strategy is to repeatedly extract short clips with random offsets from the video. We apply the deep Taylor/Layer-wise Relevance Propagation (LRP) technique to understand the classification decisions of a deep network trained with this strategy, and identify a tendency of the classifier to look mainly at the frames close to the temporal boundaries of its input clip. This “border effect” reveals the model’s relation to the step size used to extract consecutive video frames for its input, which we can then tune to improve the classifier’s accuracy without retraining the model. To our knowledge, this is the first work to apply the deep Taylor/LRP technique to any neural network operating on video data.
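
The clip-extraction strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes a video is simply a sequence of frames; the function name `extract_clip` and the parameters `clip_len` and `step` are hypothetical names chosen for illustration, not the authors' implementation. The `step` parameter plays the role of the step size between consecutive extracted frames.

```python
import random


def extract_clip(frames, clip_len, step, rng=random):
    """Pick `clip_len` frames spaced `step` apart, starting at a random offset.

    Illustrative only: names and interface are assumptions, not the paper's code.
    """
    # Number of source frames spanned by the clip, first to last frame inclusive.
    span = (clip_len - 1) * step + 1
    if span > len(frames):
        raise ValueError("video too short for this clip length and step size")
    # Random temporal offset such that the whole clip fits inside the video.
    offset = rng.randrange(len(frames) - span + 1)
    indices = [offset + i * step for i in range(clip_len)]
    return [frames[i] for i in indices], indices


# Toy "video": each frame is represented by its own index.
frames = list(range(100))
rng = random.Random(0)
clip, indices = extract_clip(frames, clip_len=16, step=2, rng=rng)
```

Under these assumptions, changing `step` at test time alters how much of the video a fixed-length clip covers, without touching the trained model.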

    Original language: English
    Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Publisher: Springer Verlag
    Pages: 297-309
    Number of pages: 13
    Publication status: Published - 2019

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 11700 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Bibliographical note

    Funding Information:
    This work was supported by the German Ministry for Education and Research as Berlin Big Data Centre (01IS14013A), Berlin Center for Machine Learning (01IS18037I) and TraMeExCo (01IS18056A). Partial funding by DFG is acknowledged (EXC 2046/1, project-ID: 390685689). This work was also supported by the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (No. 2017-0-00451, No. 2017-0-01779).

    Publisher Copyright:
    © Springer Nature Switzerland AG 2019.

    Keywords

    • Deep neural networks
    • Explaining predictions
    • Human action recognition
    • Video classification

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
