Leveraging Temporal Context in Human Pose Estimation: A Survey

Dana Skorvankova, Martin Madaras

2024

Abstract

Human pose estimation, the task of localizing skeletal joint positions from visual data, has witnessed significant progress with the advent of machine learning techniques. In this paper, we survey deep learning-based methods for human pose estimation and investigate the impact of integrating temporal information into the computational framework. Our comparison covers the evolution from methods based on Convolutional Neural Networks (CNNs) to recurrent architectures and visual transformers. While spatial information alone provides valuable insights, we examine the benefits of incorporating temporal information, which enhances robustness and adaptability to dynamic human movements. The surveyed methods are adapted to fit the requirements of the human pose estimation task and are evaluated on a real large-scale dataset, focusing on a single-person scenario with inference from 3D point cloud inputs. We present results and insights, showcasing the trade-offs between accuracy, memory requirements, and training time for various approaches. Furthermore, our findings demonstrate that models relying on attention mechanisms can achieve competitive results in human pose estimation with a limited number of trainable parameters. The survey aims to provide a comprehensive overview of machine learning-based human pose estimation techniques, emphasizing the evolution towards temporally-aware models and identifying challenges and opportunities in this rapidly evolving field.

Paper Citation


in Harvard Style

Skorvankova D. and Madaras M. (2024). Leveraging Temporal Context in Human Pose Estimation: A Survey. In Proceedings of the 4th International Conference on Image Processing and Vision Engineering - Volume 1: IMPROVE; ISBN 978-989-758-693-4, SciTePress, pages 83-90. DOI: 10.5220/0012696800003720


in Bibtex Style

@conference{improve24,
author={Dana Skorvankova and Martin Madaras},
title={Leveraging Temporal Context in Human Pose Estimation: A Survey},
booktitle={Proceedings of the 4th International Conference on Image Processing and Vision Engineering - Volume 1: IMPROVE},
year={2024},
pages={83-90},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012696800003720},
isbn={978-989-758-693-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Conference on Image Processing and Vision Engineering - Volume 1: IMPROVE
TI - Leveraging Temporal Context in Human Pose Estimation: A Survey
SN - 978-989-758-693-4
AU - Skorvankova D.
AU - Madaras M.
PY - 2024
SP - 83
EP - 90
DO - 10.5220/0012696800003720
PB - SciTePress