8 CONCLUSIONS
This paper presented a low-cost solution for detecting people
in large infrastructures while protecting their identity.
Real-time performance is achieved by using a small set of
simple features. A real scenario was presented in which
multiple depth cameras simultaneously monitor the environment.
The method merges the data from the cameras and finds
candidates by segmenting the resulting 3D point cloud. A set
of features is extracted for each candidate. Several feature
subsets were tested to assess their performance as input to a
classifier. The proposed classifier relies on features with
low computational cost and achieves good performance in a
real-time scenario.
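The first and third stages summarized above (merging the camera data and extracting low-cost features per candidate) can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the extrinsics representation, and the particular features (height, footprint extents, point count) are assumptions chosen only for illustration, and the segmentation and classification steps are omitted.

```python
def to_world(points, rotation, translation):
    """Apply a rigid transform to each 3D point.

    rotation is a 3x3 matrix as nested lists, translation a 3-vector;
    both stand in for a camera's (assumed known) extrinsic calibration.
    """
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]


def merge_clouds(clouds, extrinsics):
    """Bring every camera's points into one common frame and concatenate them."""
    merged = []
    for points, (rotation, translation) in zip(clouds, extrinsics):
        merged.extend(to_world(points, rotation, translation))
    return merged


def simple_features(cluster):
    """Hypothetical low-cost features for one candidate cluster of points."""
    xs, ys, zs = zip(*cluster)
    return {
        "height": max(zs) - min(zs),    # vertical extent
        "width_x": max(xs) - min(xs),   # footprint extent along x
        "width_y": max(ys) - min(ys),   # footprint extent along y
        "n_points": len(cluster),       # number of points in the cluster
    }


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
merged = merge_clouds(
    [[(0, 0, 0)], [(0, 0, 1.5)]],
    [(identity, (0, 0, 0)), (identity, (1, 0, 0))],
)
features = simple_features(merged)
```

In a complete pipeline, each cluster produced by the segmentation step would be passed through a function such as `simple_features`, and the resulting vector would be given to the classifier.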
As future work, it would be interesting to explore the
creation of confidence regions within the field of view (FOV)
of each camera to account for the degradation of accuracy
with distance.
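One possible reading of this idea is sketched below: each camera assigns a point a confidence that decreases with its distance from the sensor, and the most confident camera is preferred. The quadratic falloff, the range thresholds, and the function names are all assumptions made for illustration; they are not measured properties of the sensors used in this work.

```python
import math


def range_confidence(distance_m, reliable_range_m=3.0, max_range_m=8.0):
    """Hypothetical confidence in [0, 1] for a point at distance_m from a camera.

    Both thresholds are illustrative values, not calibrated ones.
    """
    if distance_m <= reliable_range_m:
        return 1.0
    if distance_m >= max_range_m:
        return 0.0
    # Assumed quadratic falloff between the reliable and the maximum range.
    t = (distance_m - reliable_range_m) / (max_range_m - reliable_range_m)
    return (1.0 - t) ** 2


def best_camera(point, camera_positions):
    """Return (index, confidence) of the camera most confident about the point."""
    scores = [range_confidence(math.dist(point, c)) for c in camera_positions]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


# A point 7 m from camera 0 and 4 m from camera 1 is attributed to camera 1.
idx, conf = best_camera((0.0, 0.0, 0.0), [(0.0, 0.0, 7.0), (0.0, 0.0, 4.0)])
```

Such per-camera weights could be used either to choose which camera's points to trust in overlapping regions or to down-weight detections made far from every sensor.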
ACKNOWLEDGEMENTS
The authors would like to thank João Mira, Thales
Portugal S.A. and ANA Aeroportos de Portugal for
enabling the data acquisition within the SMART-er
project, and Susana Brandão for providing the ESF
MATLAB wrapper.
Detecting People in Large Crowded Spaces using 3D Data from Multiple Cameras