Prediction of partially observed human activity based on pre-trained deep representation

Dong Gyu Lee, Seong Whan Lee

Research output: Contribution to journal › Article › peer-review

21 Citations (Scopus)


Predicting complex human activities from a partially observed video is valuable in many practical applications, but it is a challenging problem. When a video is only partially observed, maximizing the representational power of the given video is more important than modeling the temporal dynamics of the activity. In this paper, we propose a novel human activity descriptor for prediction, which maximizes the discriminative power of a system in a compact and efficient way using pre-trained deep networks. Specifically, the proposed descriptor can capture potentially important pairwise relationships between objects without prior knowledge or preset attributes. The relationship information is reflected automatically during descriptor construction, based on the objects' participation ratios and on local and global motion activations. Pre-trained convolutional neural networks (CNNs) are used without any additional model training. From a practical point of view, the proposed method is more cost-effective when implementing a smart surveillance system. In the experiments, we evaluate the proposed method in two respects: (1) prediction accuracy at different observation ratios, and (2) the effect of the choice of pre-trained network and layer. Experimental results on five public datasets verify the efficacy of the proposed method, which outperforms competing methods and delivers stable, high performance regardless of network selection.

Original language: English
Pages (from-to): 198-206
Number of pages: 9
Journal: Pattern Recognition
Publication status: Published - 2019 Jan


Keywords

  • Human activity prediction
  • Human interaction
  • Pre-trained CNN
  • Sub-volume co-occurrence matrix

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

