ConvNeXtPose: A Fast Accurate Method for 3D Human Pose Estimation and Its AR Fitness Application in Mobile Devices

Hong Son Nguyen, Myounggon Kim, Changbin Im, Sanghoon Han, Jung Hyun Han

Research output: Contribution to journal › Article › peer-review

Abstract

In general, 3D human-pose estimation requires high-performance computing resources. Existing methods that run on mobile devices trade accuracy for efficiency, often leaving the estimation accuracy far from sufficient for developing serious applications. In this paper, we present a mobile 3D human-pose estimation model that achieves real-time performance with a well-designed balance between efficiency and accuracy. As the backbone, our model leverages the cutting-edge ConvNeXt architecture, renowned for its feature extraction capabilities. We enhance its performance through strategic architectural modifications and the incorporation of depthwise separable convolutions in the upsampling module. Experiments on the Human3.6M dataset show that our model delivers accuracy comparable to that of state-of-the-art models while consuming significantly fewer computational resources. To showcase the practicality of our model, we present a prototype of an AR fitness application. Built upon our 3D human-pose estimation model, it helps trainees recreate trainers' poses from reference images. The effectiveness of the application is validated via experiments and evaluations. The source code can be found at: https://github.com/medialab-ku/ConvNeXtPose.
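
The abstract credits part of the efficiency gain to depthwise separable convolutions in the upsampling module. As a rough illustration of why that substitution saves computation, the sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution. The function names and the layer sizes (256 channels, 3x3 kernel) are hypothetical and are not taken from the paper or its repository.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen for illustration only
c_in, c_out, k = 256, 256, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable form needs roughly one ninth of the weights, since the ratio is 1/c_out + 1/k^2. The savings actually achieved by the model depend on its real layer configuration.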

Original language: English
Pages (from-to): 117393-117402
Number of pages: 10
Journal: IEEE Access
Volume: 11
Publication status: Published - 2023

Bibliographical note

Publisher Copyright:
© 2023 The Authors.

Keywords

  • 3D human pose estimation
  • Augmented reality
  • Pose correction
  • Pose matching

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
