5 CONCLUSIONS
We have presented a system for detecting and tracking
moving persons. Because tracking from a moving observer
is a difficult task, we combined 3D algorithms with
2D descriptors and tracking algorithms. The system
handles both a moving observer and moving objects.
Because we use MSaM, we obtain 3D information on
the scene, the observer motion, and the object motion.
Combining the components yields a mutual benefit.
By combining HOG with the MSaM tracker, we obtain
3D information on the person's motion and eliminate
false positive HOG detections. By feeding back the
Meanshift tracking, we can harvest additional features
on the object for improved MSaM performance. Our
system deals with both 3D and 2D information. As we
know the 3D depth and the position in the image plane,
we can speed up HOG (fewer pyramid levels, image
subarea validation).
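The HOG speed-up can be illustrated with a minimal sketch, which is not
the implementation used in our system. It assumes a pinhole camera with
a known focal length in pixels, a nominal person height of 1.7 m, a
detection window in which the person spans about 96 pixels, and a
geometric pyramid with a fixed scale step. All function names and
default values below are hypothetical and chosen for illustration only.

```python
import math


def expected_pixel_height(depth_m, focal_px, person_height_m=1.7):
    """Pinhole projection: pixel height of a person standing at depth_m."""
    return focal_px * person_height_m / depth_m


def select_pyramid_scales(depth_m, focal_px, window_person_px=96.0,
                          scale_step=1.05, tolerance=0.2,
                          person_height_m=1.7):
    """Return only the pyramid scales compatible with the known depth.

    A scale s means the image is shrunk by the factor s before the fixed
    size window is applied. Levels are scale_step**k for k = 0, 1, 2, ...
    (assumed pyramid layout, no upsampling).
    """
    h = expected_pixel_height(depth_m, focal_px, person_height_m)
    target = h / window_person_px          # shrink factor that fits the window
    lo, hi = target * (1.0 - tolerance), target * (1.0 + tolerance)
    k_min = max(0, math.ceil(math.log(lo, scale_step)))
    k_max = math.floor(math.log(hi, scale_step))
    # Empty list: the person is too small for the window at this depth.
    return [scale_step ** k for k in range(k_min, k_max + 1)]


def search_region(u, v, pixel_height, image_w, image_h, margin=0.5):
    """Image subarea (x0, y0, x1, y1) around the projected position (u, v)."""
    half_h = 0.5 * pixel_height * (1.0 + margin)
    half_w = 0.5 * half_h                  # assumed 1:2 width to height ratio
    x0, y0 = max(0, int(u - half_w)), max(0, int(v - half_h))
    x1, y1 = min(image_w, int(u + half_w)), min(image_h, int(v + half_h))
    return x0, y0, x1, y1
```

Under these assumptions, a person at a depth of 8 m seen with a focal
length of 800 pixels projects to about 170 pixels, so only the few levels
near a shrink factor of 1.77 need to be evaluated, and only inside the
returned subarea, instead of the full pyramid over the full image.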
Extensions to other categories are possible. The
system is not limited to a human shape descriptor.
By introducing different descriptors, the system can
track different (or even multiple) categories.
VISAPP 2011 - International Conference on Computer Vision Theory and Applications