
we aim to further improve accuracy by using MRI and
CT images in addition to gait images. We will also verify
whether this method of separating individual features is
useful for other tasks, such as facial expression
recognition. In this study, we conducted experiments
using a VAE and a CNN to verify the effectiveness of our
feature separation method. Furthermore, since the
method can be applied to various backbones, we plan
to apply it to more tasks using existing networks.
Disease Estimation Using Gait Videos by Separating Individual Features Based on Disentangled Representation Learning