TY - GEN
T1 - Action Recognition with Domain Invariant Features of Skeleton Image
AU - Chen, Han
AU - Jiang, Yifan
AU - Ko, Hanseok
N1 - Funding Information:
This work was supported by the Major Project of the Korea Institute of Civil Engineering and Building Technology (KICT) [grant number 20210397-001].
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Owing to its fast processing speed and robustness, skeleton-based action recognition has recently attracted the attention of the computer vision community. Recent Convolutional Neural Network (CNN)-based methods, which take a skeleton image as input to a CNN, have shown commendable performance in learning spatiotemporal representations of skeleton sequences. Because these methods mainly encode the temporal dimension and the skeleton joints simply as rows and columns, respectively, the latent correlations among all joints may be lost in the 2D convolution. To solve this problem, we propose a novel CNN-based method with adversarial training for action recognition. We introduce two-level domain adversarial learning to align the features of skeleton images from different view angles or subjects, respectively, thus further improving generalization. We evaluated the proposed method on NTU RGB+D. It achieves competitive results compared with state-of-the-art methods, with accuracy gains of 2.4% and 1.9% over the baseline for cross-subject and cross-view evaluation, respectively.
AB - Owing to its fast processing speed and robustness, skeleton-based action recognition has recently attracted the attention of the computer vision community. Recent Convolutional Neural Network (CNN)-based methods, which take a skeleton image as input to a CNN, have shown commendable performance in learning spatiotemporal representations of skeleton sequences. Because these methods mainly encode the temporal dimension and the skeleton joints simply as rows and columns, respectively, the latent correlations among all joints may be lost in the 2D convolution. To solve this problem, we propose a novel CNN-based method with adversarial training for action recognition. We introduce two-level domain adversarial learning to align the features of skeleton images from different view angles or subjects, respectively, thus further improving generalization. We evaluated the proposed method on NTU RGB+D. It achieves competitive results compared with state-of-the-art methods, with accuracy gains of 2.4% and 1.9% over the baseline for cross-subject and cross-view evaluation, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85124951601&partnerID=8YFLogxK
U2 - 10.1109/AVSS52988.2021.9663824
DO - 10.1109/AVSS52988.2021.9663824
M3 - Conference contribution
AN - SCOPUS:85124951601
T3 - AVSS 2021 - 17th IEEE International Conference on Advanced Video and Signal-Based Surveillance
BT - AVSS 2021 - 17th IEEE International Conference on Advanced Video and Signal-Based Surveillance
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2021
Y2 - 16 November 2021 through 19 November 2021
ER -