Table 1: Eleven variables that could be considered in designing
scenarios that contain finger motion.
Exp Variables
Do the fingers move A/Synchronously?
Do they move orderly: Adjacent/Apart?
Is the order Right-to-Left?
If failed, do we Srch Adj Fings Only?
Is fingers’ Motion Full-Cycle?
If not, do they Conserve their Direction?
Is there Hand-lvl Open-Closing?
Do we consider Previous Direction?
If yes, is it Initialized?
No. of Prtl-Cycled Finger’s Dynamics? 1, 2, or 3?
No. of End-Cycled Fing’s Dynamics? 1, 2, or 3?
ror (JPos3D), calculated as the normalized sum of the distances
between the estimated joints and the ground truth. The last is
the accuracy (Acc). We focus on highly constrained cases;
therefore, we report the accuracy only when it is below 100%.
The search algorithm knows the previous state of the fingers
in the sequence.
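As an illustration of the joint-position error described above, the following minimal sketch sums the Euclidean distances between estimated and ground-truth joints and normalizes the result. The passage does not state the normalization, so dividing by the number of joints and by an optional `scale` factor is an assumption, and the function name `jpos3d_error` is hypothetical.

```python
import math


def jpos3d_error(estimated, ground_truth, scale=1.0):
    """Normalized sum of 3D joint distances.

    Assumption: the sum is divided by (number of joints * scale);
    the paper's exact normalization is not given in this passage.
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("joint lists must be non-empty and of equal length")
    # Euclidean distance between each estimated joint and its ground truth.
    total = sum(math.dist(e, g) for e, g in zip(estimated, ground_truth))
    return total / (len(estimated) * scale)


if __name__ == "__main__":
    est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    gt = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
    print(jpos3d_error(est, gt))
```

With a different normalization (for example, by hand size), only the denominator would change.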
To animate the fingers, we employ a hierarchical hand
database similar to the one proposed by Dadgar and
Brunnett (2018). They define their hierarchical
database on various layers of complexity, which
enables us to animate hand limbs individually with a
specific emphasis on the layer of interest (e.g.,
fingers). Its finger layer (e.g., $L_4$) is further
partitionable into five sublayers (one for each finger) with
a modifiable step degree (e.g., resolution). These
properties make the database a suitable choice to
examine the uniqueness of our pose-descriptor at different
resolutions, refine it with different paths, and consider
various variables to design specific scenarios.
We create a sequence of postures for each finger
based on the temporal evolution specified for that
finger in every experiment, using a Viterbi-like
algorithm (Viterbi, 1967). This returns sequences
$S = \{S_n \mid n = 1, 2, 3, 4, 5\}$ (where
$1 \equiv$ Little, $2 \equiv$ Ring, $3 \equiv$ Middle,
$4 \equiv$ Index, $5 \equiv$ Thumb) of states
$Q = \{q_1, q_2, \ldots, q_m\}$, where $m \approx 2000$
is typical in this work. After selecting a specific global
orientation, we retrieve the input sequences. Finally,
OpenCV’s contour extraction method (Bradski,
2000) is applied to the input frames to extract the
contours when searching for the optimal posture. All
experiments employ the previous direction for the
search. The definitions of the experiments and their
evaluation are elaborated in the following subsections:
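For readers unfamiliar with the Viterbi recursion mentioned above, here is a minimal sketch of the textbook minimum-cost form over a chain of discrete pose states. It is not the authors' Viterbi-like variant; the cost tables, the neighbor-only transition rule, and the names `viterbi_min_cost` and `neighbor_cost` are illustrative assumptions.

```python
import math


def viterbi_min_cost(obs_cost, trans_cost):
    """Return the minimum-cost state path.

    obs_cost[t][s]   : cost of being in state s at frame t (hypothetical).
    trans_cost(p, s) : cost of moving from state p to state s (hypothetical).
    """
    n_states = len(obs_cost[0])
    cost = list(obs_cost[0])  # best cost of any path ending in each state
    back = []                 # back-pointers, one list per frame t >= 1
    for t in range(1, len(obs_cost)):
        new_cost, ptr = [], []
        for s in range(n_states):
            best_p = min(range(n_states),
                         key=lambda p: cost[p] + trans_cost(p, s))
            new_cost.append(cost[best_p] + trans_cost(best_p, s) + obs_cost[t][s])
            ptr.append(best_p)
        cost = new_cost
        back.append(ptr)
    # Backtrack from the cheapest final state.
    state = min(range(n_states), key=lambda s: cost[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def neighbor_cost(p, s):
    """Assumed transition rule: a finger may only stay or move one pose step."""
    return 0.0 if abs(p - s) <= 1 else math.inf


if __name__ == "__main__":
    obs = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    print(viterbi_min_cost(obs, neighbor_cost))
```

Restricting transitions to neighboring poses is one way to encode that a finger moves smoothly along its path; any other transition cost could be substituted.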
Figure 8: A sample of inputs for experiment one.
Figure 9: A sample of inputs for experiment two.
Experiment 1. In this experiment, we consider all the
different digit combinations of fingers, including their
transitions (see Fig. 8). The fingers start in the closed pose,
and one by one (thus asynchronously), starting from the
little finger, each of them opens and stays in the open
pose (thus full-cycled on the finger level and having
hand-level cycles) until all fingers have opened from right to
left (therefore orderly, or adjacent). After the fingers close
one by one, the next opening cycle starts from the next
(e.g., ring) finger. For this scenario, considering
$\mathrm{path}_1$, where each finger has 21 possible poses
from closed to open, we have 2100 poses. For the first frame,
we do not use the initialization (so the number of
dynamics is $2^5$). Nevertheless, for the end poses, at
open or close cycles, the re-initialization information
is known (thus, the number of possible dynamics for
each finger is 1).
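The schedule of Experiment 1 can be sketched as a small generator of hand states. This is only an illustration under stated assumptions: poses are indexed 0 (closed) to 20 (open), one finger moves one step per frame, the closing phase uses the same finger order as the opening phase, and the order wraps around after the thumb. The function name is hypothetical, and the frame count the sketch produces is not meant to reproduce the 2100 poses reported above, since the exact frame bookkeeping is not given in this passage.

```python
N_POSES = 21  # per-finger poses on the path: 0 = closed, 20 = open
FINGERS = ["Little", "Ring", "Middle", "Index", "Thumb"]


def experiment1_schedule(n_poses=N_POSES, n_fingers=len(FINGERS)):
    """Return a list of hand states (tuples of per-finger pose indices).

    Assumptions: one pose step per frame, a single finger moving at a time,
    closing in the same order as opening, and wrap-around finger order.
    """
    hand = [0] * n_fingers
    frames = [tuple(hand)]  # initial closed hand
    for start in range(n_fingers):  # each cycle starts one finger later
        order = [(start + k) % n_fingers for k in range(n_fingers)]
        for target in (n_poses - 1, 0):  # opening phase, then closing phase
            for f in order:
                step = 1 if target > hand[f] else -1
                while hand[f] != target:
                    hand[f] += step
                    frames.append(tuple(hand))
    return frames


if __name__ == "__main__":
    frames = experiment1_schedule()
    print(len(frames), frames[20], frames[100])
```

Changing `order` or letting several fingers advance in the same frame would turn this into the freer motions of the later experiments.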
Experiment 2. In this experiment, the collective free
motion of the fingers from the closed posture to the open one
(and the reverse) is under investigation (see Fig. 9). Here,
the other fingers do not have to wait at their end states for
one finger to reach its closed or opened state (thus there is
no hand-level cycle). As in the previous experiment,
the hand starts in the closed pose and opens from the little
finger, though not one by one (yet still asynchronously).
Each finger opens to the end (thus full-cycled)
until all fingers have opened from right to left (therefore
adjacent). Considering the 21 poses of $\mathrm{path}_1$, we have
1840 frames. Similarly, for the first frame we do not
employ the initialization (thus the number of
dynamics is $2^5$). For the end poses of each cycle, the
re-initialization information is likewise known.
Experiment 3. Here, we consider collective free
motion (so there is no hand-level cycle) from the closed
posture to the open one (and the reverse). As in the
previous experiment, the hand starts in the closed pose and
asynchronously opens its fingers, starting from the little one.
However, unlike in the other two experiments, each finger
does not open fully to the end (thus partial-cycled)
until all fingers reach their end pose from right to left
(therefore adjacent). For this scenario, considering
$\mathrm{path}_1$, we have 2230 poses. We do not use the
initialization for the first frame. However, the number
of dynamics for that initial frame is $3^5$, since the
motion does not start at the beginning of the cycle. Here, for
the end cycles, at open or closed postures, the
initialization information is available (thus, the number of
possible dynamics for each finger is 1).
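The counts $2^5$ and $3^5$ quoted in the experiments follow from giving each of the five fingers an independent choice among its possible dynamics. The sketch below enumerates these joint hypotheses; the labels "opening", "closing", and "static" are hypothetical placeholders, since this passage does not name the individual dynamics.

```python
from itertools import product


def dynamics_hypotheses(per_finger_options, n_fingers=5):
    """All joint assignments of one dynamic per finger (Cartesian product)."""
    return list(product(per_finger_options, repeat=n_fingers))


if __name__ == "__main__":
    # Two candidate dynamics per finger: 2**5 joint hypotheses.
    two = dynamics_hypotheses(("opening", "closing"))
    # Three candidate dynamics per finger: 3**5 joint hypotheses.
    three = dynamics_hypotheses(("opening", "closing", "static"))
    # Known (re-)initialization: a single dynamic per finger.
    known = dynamics_hypotheses(("known",))
    print(len(two), len(three), len(known))
```

When the (re-)initialization is known, each finger has a single dynamic, so only one joint hypothesis remains to be searched.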
Evaluation. Using our pose-descriptor in the 1D
temporal model enables us to achieve real-time
estimation, as shown in Table 2. In all experiments, the
average output frame rates are above 31 fps. These output
ICPRAM 2023 - 12th International Conference on Pattern Recognition Applications and Methods