6 CONCLUDING REMARKS
We have presented a novel pose estimation method based on
constructing tree structures from skeletonised sequences of
visual hulls. The trees are pruned and segmented into body
parts, and the extremities are identified. The method is
intended as a real-time approach to pose estimation; the
results for the pose tree computation support this and
demonstrate good labellings across multiple sequences with
complex motion. The approach can recover from errors or
degeneracies in the initial volume or skeletal reconstruction,
which overcomes inherent limitations of many tracking
approaches that cannot re-initialise. Ground-truth evaluation
on synthetic data indicates correct extremity labelling in
∼95% of frames with RMS errors < 5 cm.
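As an illustration of the pruning and extremity identification steps summarised above, the following is a minimal Python sketch. It is not the implementation evaluated in this paper. It assumes the skeleton has already been converted to an acyclic graph of named nodes. The function names, the node labels and the min_length threshold are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def prune_short_branches(adj, min_length):
    """Remove leaf branches with fewer than min_length nodes.

    A branch runs from a leaf up to, but not including, the first
    junction (a node with three or more neighbours). Assumes a tree.
    """
    adj = {node: set(nbrs) for node, nbrs in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        leaves = [n for n, nbrs in adj.items() if len(nbrs) == 1]
        for leaf in leaves:
            if leaf not in adj or len(adj[leaf]) != 1:
                continue  # removed or altered earlier in this pass
            branch, prev, cur = [leaf], leaf, next(iter(adj[leaf]))
            # Follow the chain of degree-2 nodes towards a junction.
            while len(adj[cur]) == 2:
                branch.append(cur)
                prev, cur = cur, next(n for n in adj[cur] if n != prev)
            # Only prune branches that end at a junction.
            if len(adj[cur]) >= 3 and len(branch) < min_length:
                for node in branch:
                    for nbr in adj.pop(node):
                        if nbr in adj:
                            adj[nbr].discard(node)
                changed = True
    return adj


def extremity_candidates(adj):
    """Leaf nodes of the pruned tree, taken as candidate extremities."""
    return sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)


# Hypothetical toy skeleton: five limbs plus one spurious one-node branch.
edges = [
    ("chest", "neck"), ("neck", "head"),
    ("chest", "l_elbow"), ("l_elbow", "l_hand"),
    ("chest", "r_elbow"), ("r_elbow", "r_hand"),
    ("chest", "spur"),  # noise branch, assumed to come from skeletonisation
    ("chest", "pelvis"),
    ("pelvis", "l_knee"), ("l_knee", "l_foot"),
    ("pelvis", "r_knee"), ("r_knee", "r_foot"),
]
pruned = prune_short_branches(build_adjacency(edges), min_length=2)
extremities = extremity_candidates(pruned)
```

In this toy example the one-node spur is removed and the five remaining leaves (head, two hands, two feet) are left as extremity candidates. Assigning left/right and hand/foot labels to these candidates is a separate step that the sketch does not attempt.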
ACKNOWLEDGEMENTS
The authors wish to thank Lars M. Eliassen for helping with
the implementation of the skeletonisation algorithm, and Odd
Erik Gundersen for his helpful comments during the writing of
the paper. Some of the data used in this project was obtained
from mocap.cs.cmu.edu. The CMU database was created with
funding from NSF EIA-0196217.
VISAPP 2012 - International Conference on Computer Vision Theory and Applications