frames, constructing and filtering a global pool of candidate
transformations. From this pool, the single final
solution is selected by identifying the density peak
in the space of rigid transformations. The distance
metric is derived from the entire dataset undergoing
transformation; in our case, this is the set of all
points from all input frames.
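
The density-peak selection summarized above can be illustrated with a minimal sketch. It is not the reference implementation: the RMS-displacement metric, the Gaussian kernel, the bandwidth `sigma`, and all function names below are assumptions chosen for illustration. The only property taken from the text is that the distance between two rigid transformations is measured through their effect on the whole point set.

```python
import math

def rot_z(angle):
    """3x3 rotation about the z-axis (helper for building examples)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(transform, point):
    """Apply a rigid transformation (R, t) to a single 3D point."""
    R, t = transform
    return [sum(R[i][k] * point[k] for k in range(3)) + t[i] for i in range(3)]

def transform_distance(Ta, Tb, points):
    """Assumed metric: RMS distance between the images of `points` under Ta and Tb."""
    total = 0.0
    for p in points:
        a, b = apply(Ta, p), apply(Tb, p)
        total += sum((a[i] - b[i]) ** 2 for i in range(3))
    return math.sqrt(total / len(points))

def density_peak(candidates, points, sigma):
    """Return the index of the candidate with the highest kernel density.

    Density is a sum of Gaussian weights over all candidates (assumed kernel),
    evaluated with the data-dependent distance defined above.
    """
    n = len(candidates)
    density = [1.0] * n  # each candidate contributes exp(0) = 1 to itself
    for i in range(n):
        for j in range(i + 1, n):
            d = transform_distance(candidates[i], candidates[j], points)
            w = math.exp(-(d / sigma) ** 2)
            density[i] += w
            density[j] += w
    return max(range(n), key=density.__getitem__)
```

Because the metric is evaluated on the data itself, a small rotation error counts more for a spatially large point set than for a compact one, which is the motivation for deriving the distance from the dataset rather than from the transformation parameters alone. This brute-force version needs a number of distance evaluations that grows quadratically with the number of candidates, each of which visits every point, so it is only practical for small pools.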
The algorithm delivers robust results even when
applied to noisy data acquired by current consumer-grade
depth sensors. Specifically, we used it to align
four sequences captured with a pair of Microsoft
Kinect for Azure devices. In each case, the resulting
transformation closely matched the expected one and
produced a visually better alignment than aligning
the data based on the relative placement information
of the input devices.
In the future, we intend to explore more advanced
local shape descriptors than those used in this work.
Enhancing our understanding of local shape matching
could result in improved candidate transformation fil-
tering, leading to a faster and more reliable algorithm.
A reference implementation of the proposed
registration tool is available for download at
This work was supported by the project 20-02154S
of the Czech Science Foundation. Natalie V
and Jakub Frank were partially supported by the Uni-
versity specific research project SGS-2022-015, New
Methods for Medical, Spatial and Communication
Data. The work was partially carried out as part of
the study “Virtual reality in the physiotherapy of multiple
sclerosis” supported by GAUK 202322.
