Sequence-to-sequence video prediction by learning hierarchical representations

  • Kun Fan
  • Chungin Joung
  • Seungjun Baek*

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

Video prediction, which maps a sequence of past video frames to realistic future frames, is a challenging task because it is difficult both to generate realistic frames and to model the coherent relationship between consecutive frames. In this paper, we propose a hierarchical sequence-to-sequence prediction approach to address this challenge. We present an end-to-end trainable architecture in which the frame generator automatically encodes input frames into different levels of latent Convolutional Neural Network (CNN) features, and then recursively generates future frames conditioned on the estimated hierarchical CNN features and the previous prediction. Our design is intended to automatically learn hierarchical representations of video and their temporal dynamics. Convolutional Long Short-Term Memory (ConvLSTM) is used in combination with skip connections to separately capture the sequential structure at each level of the feature hierarchy. We adopt Scheduled Sampling when training our recurrent network to facilitate convergence and to produce high-quality sequence predictions. We evaluate our method on the Bouncing Balls, Moving MNIST, and KTH human action datasets, and report favorable results compared with existing methods.
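
As an illustration of the Scheduled Sampling idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes an inverse-sigmoid decay schedule with a hypothetical constant `k`; the abstract does not say which schedule the paper uses. The names `teacher_forcing_prob`, `rollout`, and `model_step` are hypothetical, and frames are stood in for by plain numbers so the example runs without any deep learning library.

```python
import math
import random


def teacher_forcing_prob(step, k=1000.0):
    """Inverse-sigmoid decay (an assumed schedule): near 1 early in training,
    decreasing toward 0 as the training step grows."""
    # Clamp the exponent so math.exp cannot overflow for very large steps.
    exponent = min(step / k, 700.0)
    return k / (k + math.exp(exponent))


def rollout(model_step, context, targets, step, rng=random):
    """Recursively predict len(targets) future frames.

    At each time step the next input is the ground-truth frame with
    probability teacher_forcing_prob(step), otherwise the model's own
    previous prediction.
    """
    eps = teacher_forcing_prob(step)
    prev = context[-1]  # last observed frame
    predictions = []
    for target in targets:
        pred = model_step(prev)  # hypothetical one-step frame predictor
        predictions.append(pred)
        prev = target if rng.random() < eps else pred
    return predictions
```

Early in training the loop behaves almost like teacher forcing; as `step` grows, the model is increasingly conditioned on its own predictions, which is the setting it faces at test time.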

Original language: English
Article number: 8288
Pages (from-to): 1-14
Number of pages: 14
Journal: Applied Sciences (Switzerland)
Volume: 10
Issue number: 22
Publication status: Published - 2020 Nov 2

Bibliographical note

Publisher Copyright:
© 2020 by the authors. Licensee MDPI, Basel, Switzerland.

Keywords

  • Convolutional neural network
  • Hierarchical features
  • Long short-term memory
  • Recurrent neural network
  • Video prediction

ASJC Scopus subject areas

  • General Materials Science
  • Instrumentation
  • General Engineering
  • Process Chemistry and Technology
  • Computer Science Applications
  • Fluid Flow and Transfer Processes
