Owing to its fast processing speed and robustness, skeleton-based action recognition has recently attracted the attention of the computer vision community. Recent Convolutional Neural Network (CNN)-based methods, which take a skeleton image as input to a CNN, have shown commendable performance in learning spatiotemporal representations of skeleton sequences. Because these methods mainly encode the temporal dimension and the skeleton joints simply as rows and columns, respectively, the latent correlation among all joints may be lost in the 2D convolution. To solve this problem, we propose a novel CNN-based method with adversarial training for action recognition. We introduce two-level domain adversarial learning to align the features of skeleton images across different view angles and across different subjects, respectively, thereby further improving generalization. We evaluated the proposed method on NTU RGB+D. It achieves competitive results compared with state-of-the-art methods, with accuracy gains of 2.4% and 1.9% over the baseline in the cross-subject and cross-view settings, respectively.
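The row/column encoding mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `skeleton_to_image`, the per-channel min-max scaling to 0..255, and the toy coordinates are all assumptions made for illustration. It only shows how frames might map to rows, joints to columns, and (x, y, z) to three channels.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) coordinates of one joint
Pixel = Tuple[int, ...]             # three channel values in 0..255


def skeleton_to_image(sequence: Sequence[Sequence[Joint]]) -> List[List[Pixel]]:
    """Encode a skeleton sequence as an image-like grid (hypothetical helper).

    Rows index frames (time), columns index joints, and the three
    channels hold the x, y, z coordinates rescaled to 0..255.
    The min-max scaling is an assumption of this sketch.
    """
    if not sequence or not sequence[0]:
        raise ValueError("sequence must contain at least one frame and one joint")

    # Per-channel minimum and maximum over the whole sequence.
    lows = [min(joint[c] for frame in sequence for joint in frame) for c in range(3)]
    highs = [max(joint[c] for frame in sequence for joint in frame) for c in range(3)]

    def scale(value: float, c: int) -> int:
        span = highs[c] - lows[c]
        if span == 0:
            return 0  # constant channel: map every value to 0
        return round(255 * (value - lows[c]) / span)

    return [
        [tuple(scale(joint[c], c) for c in range(3)) for joint in frame]
        for frame in sequence
    ]


# Toy input: 2 frames with 3 joints each (values made up for illustration).
toy = [
    [(0.0, 0.0, 1.0), (0.5, 1.0, 1.0), (1.0, 2.0, 1.0)],
    [(0.2, 0.4, 1.0), (0.6, 1.2, 1.0), (1.0, 2.0, 1.0)],
]
image = skeleton_to_image(toy)
```

In a grid like this, joints that are adjacent in the body are not necessarily adjacent columns, which is one way to read the abstract's remark that a 2D convolution may miss correlations among all joints.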
|Title of host publication||AVSS 2021 - 17th IEEE International Conference on Advanced Video and Signal-Based Surveillance|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Publication status||Published - 2021|
|Event||17th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2021 - Virtual, Online, United States|
Duration: 2021 Nov 16 → 2021 Nov 19
Bibliographical note
Funding Information:
This work was supported by the Major Project of the Korea Institute of Civil Engineering and Building Technology (KICT) [grant number 20210397-001].
© 2021 IEEE.
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Signal Processing
- Media Technology