Hand Gesture Recognition Using MediaPipe Landmarks and Deep Learning Networks

Manuel Gil-Martín; Marco Marini; Iván Martín-Fernández; Sergio Esteban-Romero; Luigi Cinque

Topics: Deep Learning; Machine Learning; Neural Networks; Vision and Perception

In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART, pages 24-30, 2025, Porto, Portugal

Authors: Manuel Gil-Martín ¹; Marco Raoul Marini ²; Iván Martín-Fernández ¹; Sergio Esteban-Romero ¹ and Luigi Cinque ²

Affiliations: ¹ Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid (UPM), Av. Complutense 30, 28040, Madrid, Spain; ² VisionLab, Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy

Keyword(s): Hand Gesture Recognition, Human-Computer Interaction, MediaPipe Landmarks, Deep Learning.

Abstract: Advanced Human-Computer Interaction techniques are commonly used in multiple application areas, from entertainment to rehabilitation. In this context, this paper proposes a framework to recognize hand gestures using a limited number of landmarks from the video images. The hand gesture recognition system comprises an image processing module that extracts and processes the coordinates of 21 hand points, called landmarks, and a deep neural network module that models and classifies the hand gestures. The landmarks are extracted automatically with the MediaPipe software. The experiments were carried out on the IPN Hand dataset in a user-independent scenario using Subject-Wise Cross-Validation. They cover different landmark-based formats, normalizations, lengths of the gesture representation, and numbers of landmarks used as inputs. The system obtains significantly better accuracy when using the raw coordinates of the 21 landmarks over 125 timesteps with a light Recurrent Neural Network architecture (80.56 ± 1.19 %) or the hand anthropometric measures (82.20 ± 1.15 %) than when using the speed of the hand landmarks through the gesture (72.93 ± 1.34 %). The paper also studies the effect of different landmark-based normalizations of the raw coordinates, obtaining an accuracy of 83.67 ± 1.12 % when the reference is the wrist landmark of each frame, and 84.66 ± 1.09 % when the reference is the wrist landmark of the first video frame of the current gesture. In addition, the proposed solution maintains high recognition performance even when using only the coordinates of 6 (82.15 ± 1.16 %) or 4 (81.46 ± 1.17 %) specific hand landmarks, with the wrist landmark of the first video frame of the current gesture as reference.
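The two wrist-referenced normalizations, the fixed gesture length, and the reduced landmark sets described in the abstract can be illustrated with a short NumPy sketch. This is not the authors' code: it assumes each gesture is an array of shape (frames, 21, 3) holding x, y, z per landmark, that the wrist is landmark 0 (as in MediaPipe Hands indexing), and that shorter gestures are zero-padded to 125 timesteps. The abstract does not say how the fixed length is obtained or which 6 or 4 landmarks are kept, so the padding strategy and the `WRIST_AND_FINGERTIPS` subset below are hypothetical choices made only for illustration.

```python
import numpy as np

WRIST = 0  # index of the wrist landmark in MediaPipe Hands
# Hypothetical 6-landmark subset (wrist + five fingertips); the abstract
# does not state which landmarks the paper actually selects.
WRIST_AND_FINGERTIPS = [0, 4, 8, 12, 16, 20]


def normalize_per_frame(seq):
    """Express every landmark relative to the wrist of its own frame."""
    seq = np.asarray(seq, dtype=float)
    return seq - seq[:, WRIST:WRIST + 1, :]


def normalize_first_frame(seq):
    """Express every landmark relative to the wrist of the first frame.

    Unlike the per-frame variant, this keeps the global motion of the
    hand across the gesture.
    """
    seq = np.asarray(seq, dtype=float)
    return seq - seq[0, WRIST, :]


def to_fixed_length(seq, timesteps=125):
    """Truncate or zero-pad a gesture to a fixed number of timesteps.

    Zero-padding is an assumption of this sketch, not a detail from the paper.
    """
    seq = np.asarray(seq, dtype=float)[:timesteps]
    pad = timesteps - seq.shape[0]
    return np.pad(seq, ((0, pad), (0, 0), (0, 0)))


def select_landmarks(seq, indices):
    """Keep only the chosen landmarks as model inputs."""
    return np.asarray(seq, dtype=float)[:, indices, :]


# Example: a synthetic 40-frame gesture with 21 landmarks.
gesture = np.random.default_rng(0).random((40, 21, 3))
features = to_fixed_length(
    select_landmarks(normalize_first_frame(gesture), WRIST_AND_FINGERTIPS)
)
print(features.shape)  # (125, 6, 3)
```

With the per-frame reference the wrist is always at the origin, so only the hand pose remains; with the first-frame reference the wrist trajectory is preserved, which may be one reason the abstract reports slightly higher accuracy for that variant.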

CC BY-NC-ND 4.0

Paper citation in several formats:

Gil-Martín, M., Marini, M. R., Martín-Fernández, I., Esteban-Romero, S. and Cinque, L. (2025). Hand Gesture Recognition Using MediaPipe Landmarks and Deep Learning Networks. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-737-5; ISSN 2184-433X, SciTePress, pages 24-30. DOI: 10.5220/0013053500003890

@conference{icaart25,
author={Manuel Gil{-}Martín and Marco Raoul Marini and Iván Martín{-}Fernández and Sergio Esteban{-}Romero and Luigi Cinque},
title={Hand Gesture Recognition Using MediaPipe Landmarks and Deep Learning Networks},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2025},
pages={24-30},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013053500003890},
isbn={978-989-758-737-5},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Hand Gesture Recognition Using MediaPipe Landmarks and Deep Learning Networks
SN - 978-989-758-737-5
IS - 2184-433X
AU - Gil-Martín, M.
AU - Marini, M.
AU - Martín-Fernández, I.
AU - Esteban-Romero, S.
AU - Cinque, L.
PY - 2025
SP - 24
EP - 30
DO - 10.5220/0013053500003890
PB - SciTePress