Understanding Patch-Based Learning of Video Data by Explaining Predictions

Christopher J. Anders, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

Research output: Chapter in Book/Report/Conference proceeding › Chapter

14 Citations (Scopus)

Abstract

Deep neural networks have been shown to learn highly predictive models of video data. Due to the large number of images in individual videos, a common strategy for training is to repeatedly extract short clips with random offsets from the video. We apply the deep Taylor/Layer-wise Relevance Propagation (LRP) technique to understand classification decisions of a deep network trained with this strategy, and identify a tendency of the classifier to look mainly at the frames close to the temporal boundaries of its input clip. This “border effect” reveals the model’s relation to the step size used to extract consecutive video frames for its input, which we can then tune in order to improve the classifier’s accuracy without retraining the model. To our knowledge, this is the first work to apply the deep Taylor/LRP technique to any neural network operating on video data.
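
The clip-extraction strategy mentioned in the abstract (short clips taken at random offsets, with a step size between consecutive input frames) can be illustrated with a minimal sketch. This is not the authors' code: the function name `sample_clip_indices`, its parameters, and the example values (300 frames, 16-frame clips, step 2) are assumptions chosen only for illustration.

```python
import numpy as np


def sample_clip_indices(num_frames, clip_len, step, rng):
    """Pick frame indices for one clip: `clip_len` frames, `step` apart,
    starting at a random offset. Hypothetical helper, not from the paper."""
    # Number of consecutive video frames the clip spans, end to end.
    span = (clip_len - 1) * step + 1
    if span > num_frames:
        raise ValueError("video too short for this clip length and step size")
    # Random start frame such that the whole clip stays inside the video.
    offset = rng.integers(0, num_frames - span + 1)
    return offset + step * np.arange(clip_len)


rng = np.random.default_rng(0)
# Assumed example values: a 300-frame video, 16-frame clips, every 2nd frame.
idx = sample_clip_indices(num_frames=300, clip_len=16, step=2, rng=rng)
```

Because `step` only changes which frames are selected and not the shape of the clip, it can be varied at test time without touching the trained weights, which is the kind of tuning the abstract refers to.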

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 297-309
Number of pages: 13
Publication status: Published - 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11700 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Bibliographical note

Funding Information:
This work was supported by the German Ministry for Education and Research as Berlin Big Data Centre (01IS14013A), Berlin Center for Machine Learning (01IS18037I) and TraMeExCo (01IS18056A). Partial funding by DFG is acknowledged (EXC 2046/1, project-ID: 390685689). This work was also supported by the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (No. 2017-0-00451, No. 2017-0-01779).

Publisher Copyright:
© Springer Nature Switzerland AG 2019.

Keywords

  • Deep neural networks
  • Explaining predictions
  • Human action recognition
  • Video classification

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
