Classification-based Multi-task Learning for Efficient Pose Estimation Network

Dongoh Kang, Myung Cheol Roh, Hansaem Kim, Yonghyun Kim, Seong Whan Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Human pose estimation is an interesting and fundamental topic in fields such as action recognition and human-computer interaction. Although many methods have been developed recently, they still fall short of achieving high accuracy and high speed at the same time. In this paper, we propose a Classification-based Pose Estimation Network with Multi-task Learning (CPENML), which operates on a low-resolution feature map to improve accuracy and inference time simultaneously. The proposed CPENML rests on two ideas. First, the proposed keypoint and offset estimation tasks are formulated as classification, which achieves better performance than regression. Second, the proposed Multi-Scale Network (MSN) produces robust feature maps and balances the keypoint and offset tasks to maximize performance. To demonstrate the effectiveness of the proposed method, we conduct ablation studies of these ideas on the COCO dataset. Compared with benchmark methods, our method shows superior inference time and accuracy on the COCO dataset.
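
The abstract does not say how the classification-based keypoint and offset outputs are turned into coordinates. As a rough illustration only, the sketch below assumes one plausible reading: a classifier picks a cell on the low-resolution feature map, and two further classifiers pick a discrete offset bin inside that cell along x and y. The function names, the bin layout, and the `stride` parameter are assumptions made for this example and are not taken from the paper.

```python
def argmax_1d(scores):
    """Index of the largest score in a flat list."""
    return max(range(len(scores)), key=lambda i: scores[i])


def argmax_2d(grid):
    """(row, col) of the largest score in a 2-D list of lists."""
    best_row, best_col = 0, 0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value > grid[best_row][best_col]:
                best_row, best_col = r, c
    return best_row, best_col


def decode_keypoint(cell_scores, x_bin_scores, y_bin_scores, stride):
    """Decode one keypoint from classification outputs (hypothetical layout).

    cell_scores  : H x W scores over low-resolution feature-map cells
    x_bin_scores : scores over offset bins along x inside the chosen cell
    y_bin_scores : scores over offset bins along y inside the chosen cell
    stride       : input-image pixels covered by one feature-map cell
    """
    row, col = argmax_2d(cell_scores)          # coarse location: which cell
    x_bin_width = stride / len(x_bin_scores)   # pixels per x offset bin
    y_bin_width = stride / len(y_bin_scores)   # pixels per y offset bin
    # Use the centre of the winning bin as the sub-cell offset.
    dx = (argmax_1d(x_bin_scores) + 0.5) * x_bin_width
    dy = (argmax_1d(y_bin_scores) + 0.5) * y_bin_width
    return (col * stride + dx, row * stride + dy)


# Toy example: a 2 x 3 feature map with stride 8 and 4 offset bins per axis.
cell_scores = [[0.1, 0.2, 0.1],
               [0.0, 0.3, 0.9]]
x_bin_scores = [0.1, 0.1, 0.2, 0.6]
y_bin_scores = [0.7, 0.1, 0.1, 0.1]
keypoint = decode_keypoint(cell_scores, x_bin_scores, y_bin_scores, stride=8)
```

Under these assumptions, the classification offset recovers part of the resolution lost by working on a coarse feature map: the cell alone localizes the keypoint only to within `stride` pixels, while the bin narrows it to `stride / num_bins` pixels.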

Original language: English
Title of host publication: 2022 26th International Conference on Pattern Recognition, ICPR 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 8
ISBN (Electronic): 9781665490627
Publication status: Published - 2022
Event: 26th International Conference on Pattern Recognition, ICPR 2022 - Montreal, Canada
Duration: 2022 Aug 21 → 2022 Aug 25

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (Print): 1051-4651


Conference: 26th International Conference on Pattern Recognition, ICPR 2022

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

