In this paper, we proposed a new method for sign
language recognition that processes time-domain
information in the frequency domain: videos are
represented as 3D amplitude tensors using the 3D Fast
Fourier Transform (3D-FFT) and compared on the
Product Grassmann Manifold (PGM). By focusing only on
the amplitude spectrum, we obtain features that are
robust to temporal shifts. Furthermore,
PGM can effectively represent and compare the tensor
structures as subspaces generated from each tensor
mode, while the unfolding operation preserves the
temporal information. We thus established a simple yet
powerful subspace representation that accounts for
temporal information. Experimental results showed that
our method significantly improves performance over
other subspace-based methods. In the future, we are
interested in verifying the efficacy of our method in
other action recognition tasks.
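
As an illustration of the two ideas summarized above, the following
minimal sketch shows that the amplitude of the 3D-FFT is unchanged by a
circular shift along the time axis, and that subspaces built from each
mode unfolding can be compared through canonical angles. It is not the
implementation used in our experiments: the function names, the subspace
dimension of 3, the random test data, and the choice of the mean squared
cosine of the canonical angles as the similarity are all assumptions
made for illustration.

```python
import numpy as np

def amplitude_tensor(video):
    """Amplitude spectrum of the 3D FFT of a (height, width, frames) array."""
    return np.abs(np.fft.fftn(video))

def mode_subspaces(tensor, dim):
    """Orthonormal basis of the dominant `dim`-dimensional subspace of each mode unfolding."""
    bases = []
    for mode in range(tensor.ndim):
        # Mode-n unfolding: bring axis `mode` to the front and flatten the rest.
        unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
        bases.append(u[:, :dim])
    return bases

def pgm_similarity(bases_a, bases_b):
    """Assumed similarity: mean squared cosine of canonical angles, averaged over modes."""
    sims = []
    for a, b in zip(bases_a, bases_b):
        # Singular values of A^T B are the cosines of the canonical angles.
        cosines = np.linalg.svd(a.T @ b, compute_uv=False)
        sims.append(np.mean(cosines ** 2))
    return float(np.mean(sims))

rng = np.random.default_rng(0)
video = rng.random((8, 8, 16))       # hypothetical toy "video"
shifted = np.roll(video, 3, axis=2)  # circular shift along the time axis
other = rng.random((8, 8, 16))       # an unrelated toy "video"

amp = amplitude_tensor(video)
amp_shifted = amplitude_tensor(shifted)
shift_invariant = np.allclose(amp, amp_shifted)

sim_same = pgm_similarity(mode_subspaces(amp, 3), mode_subspaces(amp_shifted, 3))
sim_other = pgm_similarity(mode_subspaces(amp, 3),
                           mode_subspaces(amplitude_tensor(other), 3))
```

Note that the invariance holds exactly only for circular shifts; for real
videos, where frames enter and leave the sequence, the amplitude tensor
changes slightly rather than staying identical.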

This work was supported by JSPS KAKENHI Grant
Number 21K18481. The authors thank the Tsukuba
University of Technology students for their help in
collecting our TNSD data set.
Sign Language Recognition Based on Subspace Representations in the Spatio-Temporal Frequency Domain