2D LiDAR-Based Human Pose Tracking for a Mobile Robot
Zhenyu Gao, Ze Wang, Ludovic Saint-Bauzel and Faïz Ben Amar
Sorbonne University, CNRS, UMR 7222, Institut des Systèmes Intelligents et de Robotique - ISIR, France
Keywords:
Human Detection and Tracking, Human Orientation Estimation, Service Robotics.
Abstract:
Human pose tracking is a practical feature for service robots, as it allows the robot to predict the user's
trajectory and behavior and thus provide appropriate assistance. In this paper, we propose a human
pose tracking method based on a knee-high 2D LiDAR mounted on the mobile robot. Inspired by human gait,
a motion intention zoning and a walking gait model are proposed to adapt to various motion patterns and
achieve accurate orientation estimation. We propose a Kalman Filter-based human pose tracker that considers
the leg occlusion problem and the data association of legs. We evaluate the proposed method’s performance
in various complex scenarios and demonstrate robustness to leg occlusion. We released our implementation as open-source code at https://github.com/SyRoCo-ISIR/Frontal_human_following.
1 INTRODUCTION
Mobile robots have become more prevalent in every
corner of our lives: shopping malls, hospitals, lo-
gistics warehouses, factories, homes, and many oth-
ers. Many tasks in these applications are still shared
between robots and human operators, either because
human expertise or agility is required or because the
robot can assist the person. In this paper, we are in-
terested in the interaction between a mobile robot and
a human, particularly in the automatic pose tracking
of the person by the mobile robot. In previous studies,
common human tracking approaches obtain the user’s
position or pose through sensors with distance infor-
mation such as RGB-D cameras and/or 2D LiDAR
(Jung et al., 2012)(Ho et al., 2012)(Hu et al., 2013).
Among them, 2D LiDAR is more widely used for human tracking due to its large field of view (FOV), robustness to lighting conditions, and excellent accuracy. Some
2D LiDAR-based methods for extracting human posi-
tion from the 2D raw point clouds have already been
proposed. Usually, the point cloud is first segmented
by clustering, e.g., density-based spatial clustering
(DBSCAN) (Hasan et al., 2021), heuristic clustering
(Zhao and Shibasaki, 2005)(Chung et al., 2011), and
simple distance segmentation (Leigh et al., 2015)(Lee
et al., 2006). Clusters are then identified as humans or
other objects according to their different contours in
the point cloud. Some classifiers such as Bayesian
classifier (Tamas et al., 2010), Support Vector Data
Description (SVDD) (Chung et al., 2011)(Jung et al.,
2013) and Adaptive boosting (Adaboost) (Arras et al.,
2008)(Mozos et al., 2010) have been implemented to
label clusters by their geometric features, spatial fea-
tures, etc. Besides, Convolutional Neural Networks (CNNs) have been adopted in place of clustering and classifiers, taking 2D (Guerrero-Higueras et al., 2019) or 3D (Brščić et al., 2020) raw point clouds as input and outputting the target labels and positions.
Besides human identification and position track-
ing, research on human orientation is increasingly
in demand. Some specific services require mobile
robots to come in front of users. For example, as-
sistive robots act like guide dogs to provide naviga-
tion aid to the visually impaired (Xiao et al., 2021),
and smart walkers provide walking assistance to peo-
ple with mobility impairments (Lee et al., 2013)(Page
et al., 2015). Some studies have shown that people
prefer to see the robots in their field of vision and
may feel uncomfortable and unsafe when the robots
appear behind them (Jung et al., 2012). For some so-
cial robots, being in front of the user will facilitate
communication with him/her. Human orientation es-
timation has been widely studied for frontal human
following, where the most common method is based
on human velocity direction. (Ho et al., 2012) de-
fined the velocity direction as the sagittal axis and im-
plemented an Unscented Kalman Filter (UKF) with a
non-holonomic human movement model to improve
human pose estimation. A spin turn observer was ap-
plied to deal with spin turns. Based on the human
pose, the controller kept the robot in front of the hu-
man and aligned it to the same orientation as the hu-
man. (Nikdel et al., 2018) also assumed the human
velocity direction as the sagittal axis. They set up an
Infinite Impulse Response (IIR) filter to smooth hu-
man velocity and orientation changes. A new human
motion model based on the surrounding environment
was proposed to improve human following perfor-
mance. Their recent study implemented Reinforce-
ment Learning (RL) (Nikdel et al., 2021) to output
short-time navigation goals and employed a Timed
Elastic Band (TEB) local planner to keep the robot
in front. However, human motion is highly redundant, with many possible movement patterns. Some specific motion patterns, such as turn-in-place and lateral movement, are difficult to track with their method. Volunteers
performed simple and regular walking in the experi-
ment, but complex walking patterns were not verified.
In addition to velocity direction-based methods,
(Cifuentes and Frizera, 2016) proposed a method to
estimate human orientation using an Inertial Measure-
ment Unit (IMU) mounted on the pelvis. The human
position was obtained with a 2D LiDAR mounted at
knee height. They applied the gait cycle to improve
human pose tracking accuracy. Yet, the extra sensor
could cause inconvenience to the user. Certain meth-
ods employed the body shape in a 2D point cloud
to estimate the human orientation. (Shimizu et al.,
2016) took 2D LiDAR data at 36 viewpoints with 10
deg intervals to create the dataset. The human ori-
entation was calculated by comparing the point cloud
data observed in real time with the dataset collected.
Human motion information from a UKF tracker was integrated to improve the estimation accuracy, in what they call the shape-motion integration approach. The mean ab-
solute error (MAE) of their method in human orienta-
tion estimation is 6° to 12° when the robot was station-
ary, and the human performed simple motion. Simi-
larly, (Glas et al., 2009) assumed that the body shape
in the point cloud is a geometric shape consisting of
three circles to estimate human orientation. However, these methods are strongly influenced by body shape, clothing, etc. (Shorter et al., 2017) found through experiments that humans usually choose the metabolically optimal step width when walking straight normally. The swinging leg moves almost in a straight line because circumduction requires much effort. Based on their
research, (Yorozu and Takahashi, 2020) hypothesized
that the sagittal axis is parallel to the velocity direc-
tion of the swing leg. A human walking model was
Figure 1: System Setup: (a) Target user to track, (b)
SUMMIT-XL with mecanum wheels, (c) Hokuyo URG-04-
LX-UG01 Laser Rangefinder, (d) Motion capture system,
(e) Reflective beads on the square structure, (f) Reflective
beads on the mobile robot, (g) Ground markings.
applied to determine the swing leg to track the human
orientation during normal walking. Their experiments
proved that their method is suitable for normal walk-
ing, such as going straight, turning, and U-turns. The
results showed an MAE of 6° to 14° in human orienta-
tion. However, when the robot was in motion, the drift
of the robot’s pose would affect the human orientation
estimation. Notably, almost all of the above methods do not consider human lateral movement.
In this article, we propose a human pose track-
ing method based on 2D LiDAR for mobile robots.
Our contribution is threefold. First, we im-
plement step-width-adapted human intention zoning
allowing various human motion patterns (including
lateral movement). Second, we integrate a simple
gait model to predict the relative position between the
legs. Finally, we propose a Kalman filter-based hu-
man pose tracker that addresses the problem of short-
time leg occlusion and data association of legs. Our
article is organized as follows: Section 2 presents our platform and the four sub-modules of our system, Section 3 shows the experimental results, and the article ends with a discussion and conclusion.
2 MATERIALS AND METHODS
2.1 Platform and System Overview
We use a high-mobility mobile robot developed by
Robotnik, Summit-XL (as shown in Fig. 1). It is
equipped with mecanum wheels for omnidirectional
movement on flat indoor floors and can reach a speed
of 3 m/s. Two 2D LiDARs (Hokuyo URG-04-LX-
UG01 Laser Rangefinder) with FOV of 180° each are installed on the robot, about 40 cm above the ground.
Figure 2: System Overview Flowchart.
One faces backward to detect the target user and the
environment behind, and the other faces forward to
detect obstacles ahead. SUMMIT-XL has an inte-
grated PC, enabling the communication between the
modules through the Robot Operating System (ROS)
(Quigley et al., 2009) architecture. The development
focuses on the human pose tracking, so an open en-
vironment with low complexity and few obstacles is
assumed. In addition, a frontal following function is
developed for the SUMMIT-XL, making it strive to
come in front of the user and always face the user.
The robot deals with obstacles through a safety func-
tion that stops the robot when it is close to an obstacle.
As shown in Fig. 2, the human tracking module con-
tains leg detection, human orientation estimation, gait
parameters identification and a pose tracker. It oper-
ates at 10 Hz, matching the sampling frequency of
the LiDAR. Then, the human following module plans
the robot’s motion, which is not the main focus of this
paper. The tasks of the human tracking module are
explained below.
Figure 3: Leg detection in a 2D point cloud: one user has
both legs visible and the other has one leg occluded.
2.2 Leg Detection
Leg detection is one of the most common methods
(Hasan et al., 2021)(Chung et al., 2011)(Leigh et al.,
2015)(Cifuentes and Frizera, 2016)(Yorozu and Taka-
hashi, 2020) in human tracking. Compared with body
detection, it contains rich gait information. The 2D
LiDAR mounted at the height of human lower limbs
publishes the 2D raw point cloud (Red scattered dots
in Fig. 3). The module then segments the point cloud
into several clusters by distance thresholding and ig-
nores clusters with fewer than three points to
avoid the effect of outliers. Since the contour of the
legs in the point cloud is two adjacent semicircles, the
module adopts nonlinear optimization to detect cir-
cles in these clusters. Three variables that determine
a circle need to be optimized: the circle center coor-
dinates $(x, y)$ and the circle radius $r$. $(x_i, y_i)$ represent the position of each point in the cluster; there are a total of $n$ points. Given that the radius of a human leg is around 0.05 m, the optimization problem adds a radius limit. The nonlinear optimization is solved with the NLopt library (Johnson et al., 2014), to be precise, the NLOPT_LD_MMA algorithm (Svanberg, 2002).

$$
f_l(x, y, r) = \sum_{i=1}^{n} \left( (x_i - x)^2 + (y_i - y)^2 - r^2 \right)^2
$$
$$
\min_{x, y, r} \; f_l(x, y, r) \qquad \text{s.t.} \quad 0.03 < r < 0.07
\tag{1}
$$
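As an illustration only, the circle fit of equation (1) can be reproduced with the Python bindings of NLopt. The sketch below assumes the settings stated above (MMA with analytic gradients, radius bounded to 0.03-0.07 m); the function name, initial guess, and stopping tolerance are our own choices and are not taken from the released implementation.

import math

import nlopt
import numpy as np

def fit_leg_circle(points, r_min=0.03, r_max=0.07):
    """Fit a circle (cx, cy, r) to one 2D cluster by minimizing Eq. (1)."""
    pts = np.asarray(points, dtype=float)

    def objective(v, grad):
        cx, cy, r = v
        d = (pts[:, 0] - cx) ** 2 + (pts[:, 1] - cy) ** 2 - r ** 2
        if grad.size > 0:                      # MMA requires the gradient
            grad[0] = np.sum(2.0 * d * (-2.0 * (pts[:, 0] - cx)))
            grad[1] = np.sum(2.0 * d * (-2.0 * (pts[:, 1] - cy)))
            grad[2] = np.sum(2.0 * d * (-2.0 * r))
        return float(np.sum(d ** 2))

    opt = nlopt.opt(nlopt.LD_MMA, 3)           # NLOPT_LD_MMA, 3 variables
    opt.set_min_objective(objective)
    # Only the radius is bounded (0.03 m < r < 0.07 m); the center is free.
    opt.set_lower_bounds([-math.inf, -math.inf, r_min])
    opt.set_upper_bounds([math.inf, math.inf, r_max])
    opt.set_xtol_rel(1e-6)
    x0 = [pts[:, 0].mean(), pts[:, 1].mean(), 0.05]
    cx, cy, r = opt.optimize(x0)
    return (cx, cy), r, opt.last_optimum_value()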
The leg detection module excludes some non-
circular clusters and obtains leg candidates. The mod-
ule initializes the pair of candidate legs closest to the robot as the target user. At the next mo-
ment, the search space is determined in the vicinity of
the target user to reduce unnecessary computation.
Figure 4: Geometric human orientation estimation in motion intention zones: (Left) Stand still, (Middle) go-straight and move-sideways, (Right) go-straight-slight-turn and turn-in-place.
2.3 Human Orientation Estimation
We propose a human motion intention zoning to esti-
mate the human orientation with the current legs’ po-
sition $(x^l_{m,k}, y^l_{m,k})$, $(x^r_{m,k}, y^r_{m,k})$, where $k$ is the current
moment, m represents the measurement, and r and l
denote the left and right legs, respectively. The pre-
vious human orientation $\hat{\theta}^h_{k-1|k-1}$ is used as the baseline, where $h$ represents the body. As shown in Fig. 4,
the two-foot symbol icon represents legs and W indi-
cates the step width which is calculated by the pro-
jection of the vector between the legs on the human
frontal plane.
Based on the research of (Shorter et al., 2017), an
expanded hypothesis is proposed that the step width
$(W = W_0)$ remains almost constant when humans usu-
ally walk. However, the swinging leg is not entirely
in a plane parallel to the human sagittal plane during
walking, nor is the support leg. A specific interval of
step width is then allowed when going straight, which
forms the go-straight zone. The human orientation
estimation keeps constant in the go-straight zone, the
green area in Fig. 4 (Middle). However, when walk-
ing normally, humans often make slight turns while
maintaining the state of going straight. Slight turns
that do not require great human effort form the go-
straight-slight-turn zone (blue area). Fig. 4 (Right)
explains the geometric human orientation estimation
method in the go-straight-slight-turn zone. Based on
the assumption of constant human step width, we can
define two parallel lines separated by the constant step
width $W_0$, passing through the two legs. The direction
of these two parallel lines is assumed to determine the
human orientation. In our zoning, the human moves
laterally when the leg swings perpendicular to the hu-
man sagittal plane. Like the go-straight zone, the zon-
ing introduces a move-sideways zone (the light red
area in Fig. 4) where the human orientation estima-
tion remains constant. The final yellow area is the
turn-in-place zone. As shown in Fig. 4 (Right), the
human sagittal plane is assumed to be perpendicu-
lar to the vector of the legs. The human orientation
increment in different zones is calculated geometri-
cally, which rotates the intention zoning. The current
pseudo human orientation $\theta^h_{m,k}$
(the arrow between
legs) is then estimated as the measurement for the fol-
lowing Kalman Filter (KF).
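To make the zoning concrete, the sketch below classifies a measured leg pair into one of the four zones and returns the corresponding orientation measurement. The numeric thresholds (w_band, the 0.5/1.5 factors) and the rule for choosing between the two geometric solutions are illustrative assumptions; the paper defines the zones geometrically but does not give the interval values reproduced here.

import numpy as np

def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return np.arctan2(np.sin(a), np.cos(a))

def orientation_measurement(p_left, p_right, theta_prev, W0, w_band=0.04):
    """Zoning-based orientation measurement (Sec. 2.3).

    Returns (theta_measured, zone). Thresholds are illustrative guesses."""
    d = np.asarray(p_right, float) - np.asarray(p_left, float)   # left -> right leg
    fwd = np.array([np.cos(theta_prev), np.sin(theta_prev)])     # sagittal axis
    rgt = np.array([np.sin(theta_prev), -np.cos(theta_prev)])    # frontal axis
    L_p = float(d @ fwd)        # projection on the sagittal plane
    W = float(d @ rgt)          # projection on the frontal plane (step width)
    phi = np.arctan2(d[1], d[0])                                  # leg-vector direction

    if abs(W - W0) <= w_band:                      # go-straight: hold orientation
        return theta_prev, "go-straight"
    if abs(W) > 1.5 * W0 and abs(L_p) < 0.5 * W0:  # move-sideways: hold orientation
        return theta_prev, "move-sideways"
    if abs(W) < 0.5 * W0:                          # turn-in-place: sagittal axis
        cand = [phi + np.pi / 2, phi - np.pi / 2]  # perpendicular to the leg vector
        zone = "turn-in-place"
    else:                                          # slight turn: rotate so that two
        s = np.arcsin(np.clip(W0 / np.linalg.norm(d), -1.0, 1.0))
        cand = [phi + s, phi + np.pi - s]          # parallel lines W0 apart pass
        zone = "go-straight-slight-turn"           # through both legs
    theta = min(cand, key=lambda t: abs(wrap(t - theta_prev)))
    return wrap(theta), zone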
The human motion intention zoning adapts to
users of various body sizes with different step widths.
The leg detection module automatically collects the
average distance between legs over a few seconds as
the default step width $W_0$ for the user when initializ-
ing the target user. Then, the corresponding human
motion intention zoning is auto-generated according
to the default step width. In addition, the move-sideways and turn-in-place zones are considered unusual human motion, where the step width is variable. Thus, the four zones are subdivided into usual
human motion with solid go-straight intention and un-
usual human motion with weak go-straight intention.
This will be employed for the prediction of leg posi-
tion in the human pose tracking.
2.4 Gait Parameters Identification
The left and right legs swing periodically in the usual
human motion zones (green and blue areas). We as-
sume that the projection of the vector between the legs
on the human sagittal plane is a cosine function over
time. Equation (2) introduces the simple gait model
containing four gait parameters, which is sufficient for
predicting the relative position between the legs.
$$
L_p = L_0 \cos(2\pi f t + \varphi) + b
\tag{2}
$$
$L_p$ denotes the projection of the vector between the legs on the human sagittal plane, and $L_0$ is the
step length (or amplitude of the projection). Ca-
dence f is the rhythm of the human gait, i.e., the fre-
quency with which the legs cross one after the other.
Phase ϕ represents the current state in the gait cy-
cle. Offset b is around zero during the usual walk-
ing. When b suddenly increases or decreases, the hu-
man is out of the gait cycle. Thanks to the NLopt
library (Johnson et al., 2014) and, to be precise, the
NLOPT LN COBYLA algorithm (Powell, 1994), the
module obtains the real-time gait parameters by fitting
the historical data. The objective function is shown in
equation (3).
$$
f_g(L_0, f, \varphi, b) = \sum_{j=1}^{N} w_j \left( L_0 \cos(2\pi f t_j + \varphi) + b - l_j \right)^2
$$
$$
\min_{L_0, f, \varphi, b} \; f_g(L_0, f, \varphi, b)
\qquad \text{s.t.} \quad 0.1 < L_0 < 0.5, \quad 0.5 < f < 1.5, \quad 0 < \varphi < 2\pi, \quad 0 < b < 0.5
$$
$$
\text{where } w_j = 1/j^2
\tag{3}
$$
$N$ represents the number of historical data frames used. The optimization uses one gait cycle period, about 2 seconds, i.e., 20 frames. $j$ represents the frame number, which is in reverse chronological order. $l_j$ and $t_j$ denote the actual measurement of the projection and the actual time at frame $j$, respectively. $w_j$ is an inverse quadratic function defined as the
weighting between different moments. The older the
time, the lower the weight. This weighting gives better results than uniform weights and introduces less delay in the parameter identification, but any declining function should work.
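For illustration, the fit of equation (3) with the derivative-free COBYLA routine of NLopt might look as follows; the buffer handling, initial guess, and stopping tolerance are our own assumptions rather than details from the paper.

import nlopt
import numpy as np

def fit_gait_parameters(t_hist, lp_hist):
    """Fit (L0, f, phi, b) of Eq. (2) to the last N frames via Eq. (3).

    t_hist, lp_hist: times t_j and sagittal projections l_j, ordered from
    the most recent frame (j = 1) backwards. Sketch only.
    """
    t = np.asarray(t_hist, dtype=float)
    l = np.asarray(lp_hist, dtype=float)
    j = np.arange(1, len(l) + 1)
    w = 1.0 / j ** 2                      # older frames get lower weight

    def objective(v, grad):               # COBYLA is derivative-free: grad unused
        L0, f, phi, b = v
        r = L0 * np.cos(2.0 * np.pi * f * t + phi) + b - l
        return float(np.sum(w * r ** 2))

    opt = nlopt.opt(nlopt.LN_COBYLA, 4)   # NLOPT_LN_COBYLA, 4 parameters
    opt.set_min_objective(objective)
    opt.set_lower_bounds([0.1, 0.5, 0.0, 0.0])
    opt.set_upper_bounds([0.5, 1.5, 2.0 * np.pi, 0.5])
    opt.set_xtol_rel(1e-4)
    return opt.optimize([0.3, 1.0, 0.0, 0.0])   # [L0, f, phi, b]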
2.5 Human Pose Tracking
The pose tracking module is based on a Kalman Fil-
ter with a constant acceleration model. The state vec-
tor $P$ includes the human pose $H = \begin{bmatrix} X^h & Y^h & \theta^h \end{bmatrix}^T$,
and its first and second-order derivatives. The state-
transition equation (4) is shown below.
$$
P_k = F_k P_{k-1} + w_k
\tag{4}
$$
where:
State vector $P_k = \begin{bmatrix} H^T & \dot{H}^T & \ddot{H}^T \end{bmatrix}^T_k$,
State-transition model $F_k = \begin{bmatrix} I_3 & t I_3 & \frac{t^2}{2} I_3 \\ 0_{3,3} & I_3 & t I_3 \\ 0_{3,3} & 0_{3,3} & I_3 \end{bmatrix}$,
Sampling time $t = 100$ ms,
$I_3$ is the 3×3 identity matrix and $0_{3,3}$ the 3×3 zero matrix,
Process noise $w_k \sim \mathcal{N}(0, Q_k)$.
We employed a discrete process noise model in which $(X^h, Y^h, \theta^h)$ are considered to be independent of each other, so that most of the terms in the matrix $Q_k$ are zero. We assume that the white noise of the acceleration $(\ddot{X}^h, \ddot{Y}^h, \ddot{\theta}^h)$ is zero mean with variance $(\sigma^2_{ax}, \sigma^2_{ay}, \sigma^2_{a\theta})$. In the experiments, the variance parameters were set as $(10^2, 10^2, \pi^2)$.
For the measurements, the human position $(X^h_{m,k}, Y^h_{m,k})$ is calculated by averaging the positions of both legs detected by the leg detection module. The human orientation $\theta^h_{m,k}$ is obtained by the orientation estimation algorithm $f_o$ in Section 2.3. The measurement equation (5) is shown below.
$$
z_k = M_k P_k + v_k
\tag{5}
$$
where:
Measurements $z_k = \begin{bmatrix} \frac{x^l_{m,k} + x^r_{m,k}}{2} & \frac{y^l_{m,k} + y^r_{m,k}}{2} & f_o \end{bmatrix}^T$,
Measurement model $M_k = \begin{bmatrix} I_3 & 0_{3,3} & 0_{3,3} \end{bmatrix}$,
Measurement noise $v_k \sim \mathcal{N}(0, C_k)$.
Similarly, the measurements are considered in-
dependent. Hence, the measurement noise ma-
trix $C_k$ is diagonal, where $(\sigma^2_X, \sigma^2_Y, \sigma^2_\theta)$ is equal to $(0.03^2, 0.03^2, (\pi/15)^2)$, respectively.
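The filter itself is a standard linear Kalman predict/update cycle over the nine-dimensional state. The sketch below wires in the transition and measurement models of equations (4) and (5); wrapping the orientation innovation and passing $Q_k$ in as a parameter are our own implementation choices, not details stated in the paper.

import numpy as np

DT = 0.1                                   # 10 Hz LiDAR sampling period (s)
I3, Z3 = np.eye(3), np.zeros((3, 3))

F = np.block([[I3, DT * I3, 0.5 * DT ** 2 * I3],   # Eq. (4): constant acceleration
              [Z3, I3, DT * I3],
              [Z3, Z3, I3]])
M = np.hstack([I3, Z3, Z3])                # Eq. (5): the pose is observed directly
C = np.diag([0.03 ** 2, 0.03 ** 2, (np.pi / 15.0) ** 2])   # measurement noise

def kf_step(P_est, Sigma, z, Q, C=C):
    """One predict/update cycle; z = [x_mid, y_mid, theta_measured]."""
    # prediction
    P_pred = F @ P_est
    S_pred = F @ Sigma @ F.T + Q
    # update, wrapping the orientation innovation to (-pi, pi]
    y = z - M @ P_pred
    y[2] = np.arctan2(np.sin(y[2]), np.cos(y[2]))
    S = M @ S_pred @ M.T + C
    K = S_pred @ M.T @ np.linalg.inv(S)
    P_new = P_pred + K @ y
    Sigma_new = (np.eye(9) - K @ M) @ S_pred
    return P_new, Sigma_new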
Since leg detection does not label each leg mea-
sured, the data association problem exists when track-
ing the human, i.e., matching the legs’ measurements
of the next moment $(x^l_{m,k+1}, y^l_{m,k+1})$, $(x^r_{m,k+1}, y^r_{m,k+1})$ with the predictions of the current moment $(x^l_{k+1|k}, y^l_{k+1|k})$, $(x^r_{k+1|k}, y^r_{k+1|k})$. The linear model
KF can only predict the human pose and cannot
determine the position of the legs. The human motion
intention zoning, which contains the legs’ spatial
relationship, is adopted for further legs’ predictions.
In zones where humans usually walk with solid
go-straight intention (green and blue areas), the
gait model is applied to predict the relative position
between legs $(x_{k+1|k}, y_{k+1|k})$. For unusual motion
with weak go-straight intention (light red and yellow areas), the legs' predictions are assumed to be symmetrical with respect to the human sagittal plane. Combining these with the human pose predicted by the KF, $\hat{P}_{k+1|k}$,
the position of both legs at the next moment is
estimated. The Nearest Neighbor (NN) approach is
used to match the two arriving measurements of legs
and to distinguish between the left and right legs.
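A nearest-neighbor assignment over the two possible pairings can be sketched as below; the function name and the summed-distance matching cost are illustrative assumptions.

import numpy as np

def associate_legs(measurements, pred_left, pred_right):
    """Assign two unlabeled leg measurements to the predicted left/right legs."""
    m0, m1 = (np.asarray(m, dtype=float) for m in measurements)
    pl = np.asarray(pred_left, dtype=float)
    pr = np.asarray(pred_right, dtype=float)
    keep = np.linalg.norm(m0 - pl) + np.linalg.norm(m1 - pr)   # m0->left, m1->right
    swap = np.linalg.norm(m0 - pr) + np.linalg.norm(m1 - pl)   # m0->right, m1->left
    return (m0, m1) if keep <= swap else (m1, m0)              # returned as (left, right)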
One leg obstructing the other is a common oc-
clusion problem during human following. The oc-
cluded leg is spatially constrained behind the visible
leg. Since the two legs cannot be far apart, the pos-
sible area of the occluded leg is heuristically defined
as the hidden area behind the visible leg. The green
trapezoidal area, as shown in Fig. 3, is the hidden area
of the right leg. The center of this area, $R_o = (x^r_o, y^r_o)$, was defined at one step width $W_0$ outside the extension of the line from the robot $S = (X^s_{m,k+1}, Y^s_{m,k+1})$ to the visible left leg $L = (x^l_{m,k+1}, y^l_{m,k+1})$.
$$
\overrightarrow{OR_o} = \overrightarrow{OS} + \overrightarrow{SR_o} = \overrightarrow{OS} + \left( \frac{W_0}{\lVert \overrightarrow{SL} \rVert} + 1 \right) \overrightarrow{SL}
\tag{6}
$$
The right leg prediction $R_{k+1|k} = (\hat{x}^r_{k+1|k}, \hat{y}^r_{k+1|k})$ is attracted by the center of the hidden area, $R_o$. The longer the occlusion duration $T_o$, the more distorted the model prediction becomes and the more confidence is placed in the spatial constraint. Consequently, the attraction grows with the occlusion duration, as given in equation (7).
$$
\overrightarrow{OR}_{k+1|k+1} = \alpha \, \overrightarrow{OR}_{k+1|k} + (1 - \alpha) \, \overrightarrow{OR_o},
\qquad \text{where } \alpha = \frac{1}{e^{T_o}}
\tag{7}
$$
The confidence of the model prediction, $\alpha$, gradually converges to 0 as the occlusion duration $T_o$ increases.
In the absence of one measurement, the improved
prediction is used as the system’s measurement input
while increasing the input’s variance.
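The occlusion handling of equations (6) and (7) reduces to a short blend between the model prediction and the spatial constraint. The sketch below is a direct transcription under the assumption that all positions are expressed in the same fixed frame $O$; the function name is hypothetical.

import numpy as np

def occluded_leg_estimate(robot_S, visible_L, predicted_R, W0, T_o):
    """Blend the predicted occluded-leg position with the hidden-area center.

    robot_S, visible_L, predicted_R: 2D points in frame O; T_o: occlusion
    duration in seconds. Sketch of Eqs. (6)-(7)."""
    S = np.asarray(robot_S, dtype=float)
    L = np.asarray(visible_L, dtype=float)
    SL = L - S
    # Eq. (6): hidden-area center, one step width W0 beyond the visible leg
    # along the robot-to-visible-leg direction.
    R_o = S + (W0 / np.linalg.norm(SL) + 1.0) * SL
    # Eq. (7): alpha = 1/e^{T_o}; longer occlusions trust the constraint more.
    alpha = np.exp(-T_o)
    return alpha * np.asarray(predicted_R, dtype=float) + (1.0 - alpha) * R_o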
3 EXPERIMENT AND RESULTS
Twelve volunteers (nine men and three women) participated in the experiment. In accordance with the Helsinki Declaration, they were fully informed of the data collection and its purpose. They were also informed that they could leave the experiment at any time and that they could request access to, modification of, or deletion of the collected data at any time. They held an emergency stop button in hand to avoid any hazard. The exper-
iments were carried out in a room equipped with the
Motion Capture system for ground truth, as shown in
Fig. 1. We had considered mounting reflective beads
on the body, but since there were only four cameras in
the room, the beads were easily occluded, and mea-
surements were lost. To avoid occlusion of the beads
for the MoCap system, volunteers held a sufficiently wide square structure flat against their chests during the experiments. The reflective beads were installed on
the structure and mobile robot to capture their trajec-
tories (ground truth) in an inertial frame. Since the waist can rotate, the orientation of the upper body and lower body can differ, such as facing toward the left front while walking forward. Therefore, during the experiments, we asked the volunteers to keep their waist still so that the upper and lower body shared the same orientation. Each naive volunteer had a 30-minute preparation period to get used to the robot's
following function and to understand how the robot
works. The volunteers then had another 30 minutes to
try four scenarios:
Scenario 1: Go straight forward, make a 55° right turn, and then back up
Scenario 2: Move sideways to the right, make a 180° turn, and then move sideways to the left
Scenario 3: Make a 90° right turn and then go straight forward immediately
Scenario 4: Perform a random movement
To show the experimental results in detail, the first
volunteer is taken as an example. Fig. 5 presents a
comparison of his orientation estimation and ground
truth in the four scenarios. Table 1 contains his orien-
tation tracking MAE in both cases with and without
occlusion processing, and position tracking MAE on
both frontal and sagittal axes in the case with occlu-
sion processing.
In Scenario 1, the first volunteer kept his upper
body facing forward and advanced naturally two me-
ters, but a slight rotation of the pelvis was unavoid-
able while walking. Likewise, his orientation oscil-
lated periodically in small amplitudes (∼8°) under
MoCap. Our tracking module detected this small am-
plitude oscillation, and the estimated orientation was
somewhat more stable than the actual value due to the
step width interval. The volunteer then turned to the
right (∼55°), where the estimation was slightly delayed (∼0.4 s). However, the overall trend of the
orientation was tracked by our system. Finally, the
volunteer moved backward two meters, where orientation oscillation was observed as well. In
general, our tracking module maintained good track-
ing, showing an MAE of about 2.3°. Distinguished
Table 1: Tracking MAE for the first volunteer.

Tracking MAE                           | Scenario 1  | Scenario 2  | Scenario 3  | Scenario 4
In orientation (°) *                   | 2.3 ± 3.9   | 4.1 ± 6.8   | 7.0 ± 9.7   | 14.0 ± 17.5
In orientation (°)                     | 2.3 ± 3.9   | 3.2 ± 6.3   | 4.1 ± 6.7   | 11.5 ± 13.6
In position on the frontal axis (m)    | 0.15 ± 0.14 | 0.26 ± 0.30 | 0.29 ± 0.23 | 0.40 ± 0.37
In position on the sagittal axis (m)   | 0.38 ± 0.59 | 0.1 ± 0.08  | 0.27 ± 0.23 | 0.45 ± 0.42
* Only this row corresponds to the tracking module without occlusion processing.
Figure 5: Human tracking performance for the first volunteer; four panels (Scenarios 1-4) show the comparison of human orientation under motion capture and under the human tracking module with/without Occlusion Processing (OP).
by motion patterns, the MAE was about 2° in the forward and backward phases and about 7° in the turning phase.
For Scenario 2, the volunteer held the upper body
forward and shifted one meter to the right, and then he
turned 180
in place in about 2 s. The tracking module
did a delay during the turning phase but kept tracking
and did not reverse the human’s frontal plane (i.e., the
left and right legs were incorrectly correlated). Addi-
tionally, there was a case where one leg was occluded
by the other during the turning phase. For compari-
son, we activated two sets of human tracking modules
at the same time: one without occlusion processing
(blue curve in Fig. 5) and the other with occlusion
processing (red curve). This processing improved the
tracking performance and decreased the overall track-
ing MAE by 0.9° (see Table 1). The tracking presented an MAE of about 3.2° in Scenario 2, about 2° during lateral moves, and about 10° during significant turning.
In Scenario 3, the volunteer turned quickly 90° to the right and went straight ahead; the tracking mod-
ule showed a slight delay during the turning phase but
the occlusion processing reduced the delay. In the for-
ward phase, the robot was to the side of the user, so
the right leg was often occluded by the left leg. The
occlusion processing significantly reduced the overall
tracking MAE by 2.9°.
Scenario 4 is challenging, with the volunteer moving randomly in an open room. The tracking module kept tracking him, but not as well as in the previous
scenarios, with an MAE of about 11.5°. This sce-
nario illustrates the robustness of our system, which
can adapt to most human motion patterns, as well as
random combinations of different patterns.
In general, the experiment with all 12 volunteers
proceeded satisfactorily, with MAEs ranging from 2° to 12° for the four scenarios. This also shows that our
system adapted to the users’ step widths.
4 DISCUSSION
The experimental results show that our system ac-
complishes the pose tracking for twelve volunteers.
A slight delay (< 0.6 s) in orientation tracking is no-
ticed when humans make a turn. We think this delay
may be due to human turning habits, where humans
usually turn the upper body before the lower body.
Therefore, the orientation ground truth from the chest
would be a little earlier than our tracking module from
the legs. Although the volunteers were told to keep
the upper and lower body synchronized during the ex-
periment, there was a slight error in the ground truth.
Furthermore, the delay may come from go-straight
and move-sideways intervals. The interval setting
avoids over-sensitivity of the human orientation to
leg position changes and introduces delay simultane-
ously. By adapting to the step width, our system ini-
tially accommodates the differences due to body size
and walking habits. For better tracking performance,
our system requires users not to wear robes, not to
jump above the detection plane, not to stand with legs
crossed, not to run at high speed, etc. Compared to
current state-of-the-art methods (Cifuentes and Friz-
era, 2016)(Nikdel et al., 2021)(Yorozu and Takahashi,
2020), our tracker considers human lateral movement
as well as the occlusion of one leg during walking.
This brings us the advantage of pose tracking accu-
racy. Compared with (Yorozu and Takahashi, 2020),
we estimate orientation using the relative position be-
tween the legs, independent of robot pose drift, which
is especially suitable for highly mobile robots. In
terms of experimental design, unlike most studies that
adopted simple scenarios, we designed complex sce-
narios that encompassed most motion patterns. No-
tably, handling and adapting to random walking is also a distinct advantage of our method.
5 CONCLUSION
In this paper, we developed a robust step-width-
adapted human pose tracker based on 2D LiDAR. We
have dealt with diverse walking patterns, the problem of self-occlusion of the legs, and data association
when tracking. We performed a quantitative analysis
of the system’s performance using experimental data
and identified some existing limitations. The tracker performed robustly in all scenarios, even
when the user moved randomly.
We will attempt a data-driven method to estimate
human pose in addition to this gait-inspired method.
We expect to expand on the human pose tracker with
new contactless human-robot interaction possibilities,
such as frontal human following, navigation for the
visually impaired, logistics in automated warehouses,
and social robots in shopping malls. Furthermore, im-
proving physical human-robot interaction by utilizing
human pose is also a direction we will work on.
ACKNOWLEDGEMENTS
This work has been partially supported by ROBOTEX
2.0, the French Infrastructure in Robotics under the
grants ROBOTEX (EQUIPEX ANR-10-EQPX-44-
01) and TIRREX (EQUIPEX+ grant ANR-21-ESRE-
0015). Zhenyu Gao was sponsored by the China
Scholarship Council.
REFERENCES
Arras, K. O., Grzonka, S., Luber, M., and Burgard, W.
(2008). Efficient people tracking in laser range data
using a multi-hypothesis leg-tracker with adaptive oc-
clusion probabilities. In 2008 IEEE International
Conference on Robotics and Automation, pages 1710–
1715. IEEE.
Brščić, D., Evans, R. W., Rehm, M., and Kanda, T. (2020). Using a rotating 3d lidar on a mobile robot for estimation of person's body angle and gender. Sensors, 20(14):3964.
Chung, W., Kim, H., Yoo, Y., Moon, C.-B., and Park, J.
(2011). The detection and following of human legs
through inductive approaches for a mobile robot with
a single laser range finder. IEEE transactions on in-
dustrial electronics, 59(8):3156–3166.
Cifuentes, C. A. and Frizera, A. (2016). Human-robot inter-
action strategies for walker-assisted locomotion, vol-
ume 115. Springer.
Glas, D. F., Miyashita, T., Ishiguro, H., and Hagita, N.
(2009). Laser-based tracking of human position and
orientation using parametric shape modeling. Ad-
vanced robotics, 23(4):405–428.
Guerrero-Higueras, Á. M., Álvarez-Aparicio, C., Calvo Olivera, M. C., Rodríguez-Lera, F. J., Fernández-Llamas, C., Rico, F. M., and Matellán, V. (2019). Tracking people in a mobile robot from 2d lidar scans using full convolutional neural networks for security in cluttered environments. Frontiers in Neurorobotics, 12:85.
Hasan, M., Hanawa, J., Goto, R., Fukuda, H., Kuno, Y., and
Kobayashi, Y. (2021). Person tracking using ankle-
level lidar based on enhanced dbscan and optics. IEEJ
transactions on electrical and electronic engineering,
16(5):778–786.
Ho, D. M., Hu, J.-S., and Wang, J.-J. (2012). Behavior
control of the mobile robot for accompanying in front
of a human. In 2012 IEEE/ASME International Con-
ference on Advanced Intelligent Mechatronics (AIM),
pages 377–382. IEEE.
Hu, J.-S., Wang, J.-J., and Ho, D. M. (2013). Design of
sensing system and anticipative behavior for human
following of mobile robots. IEEE Transactions on In-
dustrial Electronics, 61(4):1916–1927.
Johnson, S. G. et al. (2014). The nlopt nonlinear-
optimization package.
Jung, E.-J., Lee, J. H., Yi, B.-J., Park, J., Noh, S.-T.,
et al. (2013). Development of a laser-range-finder-
based human tracking and control algorithm for a
marathoner service robot. IEEE/ASME transactions
on mechatronics, 19(6):1963–1976.
Jung, E.-J., Yi, B.-J., et al. (2012). Control algorithms
for a mobile robot tracking a human in front. In
2012 IEEE/RSJ International Conference on Intelli-
gent Robots and Systems, pages 2411–2416. IEEE.
Lee, G., Ohnuma, T., Chong, N. Y., and Lee, S.-G. (2013).
Walking intent-based movement control for jaist ac-
tive robotic walker. IEEE Transactions on Systems,
Man, and Cybernetics: Systems, 44(5):665–672.
Lee, J. H., Tsubouchi, T., Yamamoto, K., and Egawa, S.
(2006). People tracking using a robot in motion with
laser range finder. In 2006 IEEE/RSJ International
Conference on Intelligent Robots and Systems, pages
2936–2942. IEEE.
Leigh, A., Pineau, J., Olmedo, N., and Zhang, H. (2015).
Person tracking and following with 2d laser scanners.
In 2015 IEEE international conference on robotics
and automation (ICRA), pages 726–733. IEEE.
Mozos, O. M., Kurazume, R., and Hasegawa, T. (2010).
Multi-part people detection using 2d range data. International Journal of Social Robotics, 2:31–40.
Nikdel, P., Shrestha, R., and Vaughan, R. (2018). The
hands-free push-cart: Autonomous following in front
by predicting user trajectory around obstacles. In 2018
IEEE International Conference on Robotics and Au-
tomation (ICRA), pages 4548–4554. IEEE.
Nikdel, P., Vaughan, R., and Chen, M. (2021). Lbgp: Learn-
ing based goal planning for autonomous following
in front. In 2021 IEEE International Conference on
Robotics and Automation (ICRA), pages 3140–3146.
IEEE.
Page, S., Martins, M. M., Saint-Bauzel, L., Santos, C. P.,
and Pasqui, V. (2015). Fast embedded feet pose esti-
mation based on a depth camera for smart walker. In
2015 IEEE International Conference on Robotics and
Automation (ICRA), pages 4224–4229. IEEE.
Powell, M. J. (1994). A direct search optimization method
that models the objective and constraint functions by
linear interpolation. Springer.
Quigley, M., Conley, K., Gerkey, B., Faust, J., Foote, T.,
Leibs, J., Wheeler, R., Ng, A. Y., et al. (2009). Ros: an
open-source robot operating system. In ICRA work-
shop on open source software, volume 3, page 5.
Kobe, Japan.
Shimizu, M., Koide, K., Ardiyanto, I., Miura, J., and
Oishi, S. (2016). Lidar-based body orientation esti-
mation by integrating shape and motion information.
In 2016 IEEE International Conference on Robotics
and Biomimetics (ROBIO), pages 1948–1953. IEEE.
Shorter, K. A., Wu, A., and Kuo, A. D. (2017). The high
cost of swing leg circumduction during human walk-
ing. Gait & posture, 54:265–270.
Svanberg, K. (2002). A class of globally convergent opti-
mization methods based on conservative convex sepa-
rable approximations. SIAM Journal on Optimization, 12(2):555–573.
Tamas, L., Popa, M., Lazea, G., Szoke, I., and Majdik, A.
(2010). Lidar and vision based people detection and
tracking. Journal of Control Engineering and Applied Informatics, 12(2):30–35.
Xiao, A., Tong, W., Yang, L., Zeng, J., Li, Z., and Sreenath,
K. (2021). Robotic guide dog: Leading a human
with leash-guided hybrid physical interaction. In 2021
IEEE International Conference on Robotics and Au-
tomation (ICRA), pages 11470–11476. IEEE.
Yorozu, A. and Takahashi, M. (2020). Estimation of body
direction based on gait for service robot applications.
Robotics and Autonomous Systems, 132:103603.
Zhao, H. and Shibasaki, R. (2005). A novel system
for tracking pedestrians using multiple single-row
laser-range scanners. IEEE Transactions on systems,
man, and cybernetics-Part A: systems and humans,
35(2):283–291.