MHCanonNet: Multi-Hypothesis Canonical lifting Network for self-supervised 3D human pose estimation in the wild video

Hyun Woo Kim, Gun Hee Lee, Woo Jeoung Nam, Kyung Min Jin, Tae Kyung Kang, Geon Jun Yang, Seong Whan Lee

Research output: Contribution to journal › Article › peer-review

Abstract

Recent advances in 3D human pose estimation using fully supervised learning approaches have shown impressive results; however, these methods rely heavily on large amounts of annotated 3D data, which are challenging to obtain outside controlled laboratory environments. In this study, we therefore propose a new self-supervised training method for a 3D human pose estimation network that uses unlabeled multi-view images. The model learns the relative depths between joints without any 3D annotation by satisfying multi-view consistency constraints on unlabeled, uncalibrated multi-view videos, while simultaneously learning representations of multiple plausible pose hypotheses. For this reason, we call the proposed network a Multi-Hypothesis Canonical Lifting Network (MHCanonNet). By enriching the diversity of the extracted features and keeping multiple possibilities open, the network accurately estimates the final 3D pose. The key idea lies in the design of a novel, unbiased reconstruction objective function that combines multiple hypotheses from different viewpoints. The proposed approach achieves state-of-the-art results not only on two popular benchmark datasets, Human3.6M and MPI-INF-3DHP, but also on an in-the-wild dataset, Ski-Pose, surpassing existing self-supervised training methods.
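
The multi-view consistency idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation, and the abstract does not specify the actual objective. The orthographic projection, the scale normalization, the plain averaging over hypotheses, and every name used here (`rot_y`, `normalize_2d`, `reprojection_error`, `cross_view_loss`) are assumptions made purely for illustration.

```python
import numpy as np


def rot_y(theta):
    """Rotation matrix about the vertical (y) axis; a stand-in for a predicted camera rotation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def normalize_2d(pose_2d):
    """Center a (J, 2) pose on its mean joint and scale it to unit Frobenius norm."""
    centered = pose_2d - pose_2d.mean(axis=0, keepdims=True)
    return centered / np.linalg.norm(centered)


def reprojection_error(canon_3d, rot, target_2d):
    """Rotate a canonical (J, 3) pose into a camera frame, drop depth
    (assumed orthographic projection), and compare with the 2D pose seen in that view."""
    cam_3d = canon_3d @ rot.T
    projected = normalize_2d(cam_3d[:, :2])
    return np.abs(projected - normalize_2d(target_2d)).mean()


def cross_view_loss(hyps_by_view, rots, poses_2d):
    """Average reprojection error over every (source view, hypothesis, target view) triple.

    hyps_by_view[v]: (K, J, 3) canonical-pose hypotheses lifted from view v.
    rots[w]:         (3, 3) rotation from the canonical frame to camera w.
    poses_2d[w]:     (J, 2) 2D pose observed in view w.
    Assumption: hypotheses are combined by a plain mean.
    """
    errors = [
        reprojection_error(hyp, rot, target)
        for hyps in hyps_by_view
        for hyp in hyps
        for rot, target in zip(rots, poses_2d)
    ]
    return float(np.mean(errors))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pose = rng.normal(size=(17, 3))  # synthetic 17-joint canonical pose
    rots = [rot_y(0.0), rot_y(np.pi / 3)]  # two synthetic camera rotations
    poses_2d = [(pose @ r.T)[:, :2] for r in rots]  # synthetic 2D observations
    hyps = [np.stack([pose, pose]) for _ in rots]  # K = 2 hypotheses per view
    print(cross_view_loss(hyps, rots, poses_2d))
```

In this toy setup the loss vanishes when every hypothesis matches the pose that generated the 2D observations, and it grows as the hypotheses drift away from it. No 3D labels or camera intrinsics enter the computation, which is the property the abstract emphasizes.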

Original language: English
Article number: 109908
Journal: Pattern Recognition
Volume: 145
Publication status: Published - 2024 Jan

Bibliographical note

Funding Information:
This work was partially supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00079, Artificial Intelligence Graduate School Program (Korea University); No. 2022-0-00984, Development of Artificial Intelligence Technology for Personalized Plug-and-Play Explanation and Verification of Explanation).

Publisher Copyright:
© 2023 Elsevier Ltd

Keywords

  • 3D human pose
  • Multi-view geometry
  • Self-supervised learning

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
