By using the depth data to adaptively scale the target
size, we showed that the tracker can withstand significant
radial motion while maintaining good accuracy, as measured
by the Jaccard index. Moreover, we presented a simple way
to extend the RGB-T tracker presented in (Talha and
Stolkin, 2012) to RGB-D-T by using a histogram of
3D normals as the depth descriptor. Although the depth
feature we used did not significantly improve the accuracy
of the tracker in the tested video sequences, we believe it
could improve robustness in more complicated sequences
involving interactions among several people.
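As an illustration of the two quantities mentioned above, the following is a minimal sketch and not the implementation used in this work. It assumes a pinhole camera model, under which the apparent target size is inversely proportional to depth, and axis-aligned bounding boxes given as (x, y, width, height). The function names scale_size and jaccard_index are hypothetical.

```python
from typing import Tuple

# Hypothetical box convention: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def scale_size(ref_size: Tuple[float, float],
               ref_depth: float,
               depth: float) -> Tuple[float, float]:
    """Rescale a reference (width, height) for a new target depth.

    Assumption: pinhole projection, so the apparent size is
    proportional to 1 / depth.
    """
    if ref_depth <= 0 or depth <= 0:
        raise ValueError("depths must be positive")
    s = ref_depth / depth
    return (ref_size[0] * s, ref_size[1] * s)


def jaccard_index(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Overlap extent along each axis, clamped at zero when disjoint.
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

Under these assumptions, a target that moves from 2 m to 4 m away has its box halved in each dimension, and the Jaccard index then compares the rescaled box against the ground-truth annotation.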
In this work, we modelled the target using a single
histogram for each data source. An interesting extension
would be to use a multi-part model and to investigate how
to efficiently compute histogram descriptors for each
target using part-specific fusion schemes. Finally, using
a depth descriptor based on local shape information, such
as curvature distributions, instead of 3D normals could
add robustness to human deformations.
REFERENCES
Bouguet, J.-Y. (2004). Camera calibration toolbox for
MATLAB.
Choi, C. and Christensen, H. I. (2013). RGB-D object
tracking: A particle filter approach on GPU. In Intelligent
Robots and Systems (IROS), 2013 IEEE/RSJ International
Conference on, pages 1084–1091. IEEE.
Endres, F., Hess, J., Engelhard, N., Sturm, J., Cremers, D.,
and Burgard, W. (2012). An evaluation of the RGB-D
SLAM system. In Robotics and Automation (ICRA),
2012 IEEE International Conference on.
Everingham, M., Van Gool, L., Williams, C. K., Winn, J.,
and Zisserman, A. (2010). The PASCAL visual object
classes (VOC) challenge. International journal of
computer vision, 88(2):303–338.
Jafari, O. H., Mitzel, D., and Leibe, B. (2014). Real-time
RGB-D based people detection and tracking for mobile
robots and head-worn cameras. In Robotics and
Automation (ICRA), 2014 IEEE International
Conference on, pages 5636–5643. IEEE.
Kumar, S., Marks, T. K., and Jones, M. (2014). Improv-
ing person tracking using an inexpensive thermal in-
frared sensor. In Computer Vision and Pattern Recog-
nition Workshops (CVPRW), 2014 IEEE Conference
on, pages 217–224. IEEE.
Luber, M., Spinello, L., and Arras, K. O. (2011). People
tracking in RGB-D data with on-line boosted target
models. In Proc. of The International Conference on
Intelligent Robots and Systems (IROS).
Matsumoto, K., Nakagawa, W., Saito, H., Sugimoto, M.,
Shibata, T., and Yachida, S. (2015). AR visualization
of thermal 3D model by hand-held cameras.
In Proceedings of the 10th International Conference
on Computer Vision Theory and Applications, pages
480–487.
Mogelmose, A., Bahnsen, C., Moeslund, T. B., Clapés,
A., and Escalera, S. (2013). Tri-modal person
re-identification with RGB, depth and thermal features. In
Computer Vision and Pattern Recognition Workshops
(CVPRW), 2013 IEEE Conference on, pages 301–307.
IEEE.
Nakagawa, W., Matsumoto, K., de Sorbier, F., Sugimoto,
M., Saito, H., Senda, S., Shibata, T., and Iketani,
A. (2014). Visualization of temperature change using
RGB-D camera and thermal camera. In Computer
Vision-ECCV 2014 Workshops, pages 386–400.
Springer.
Nummiaro, K., Koller-Meier, E., and Van Gool, L. (2002).
Object tracking with an adaptive color-based parti-
cle filter. In Pattern Recognition, pages 353–360.
Springer.
Nummiaro, K., Koller-Meier, E., and Van Gool, L. (2003).
An adaptive color-based particle filter. Image and vi-
sion computing, 21(1):99–110.
Pérez, P., Hue, C., Vermaak, J., and Gangnet, M. (2002).
Color-based probabilistic tracking. In Computer
Vision-ECCV 2002, pages 661–675. Springer.
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio,
M., Moore, R., Kipman, A., and Blake, A. (2011).
Real-time human pose recognition in parts from sin-
gle depth images. In Proceedings of the 2011 IEEE
Conference on Computer Vision and Pattern Recogni-
tion.
Stolkin, R., Rees, D., Talha, M., and Florescu, I. (2012).
Bayesian fusion of thermal and visible spectra camera
data for region based tracking with rapid background
adaptation. In Multisensor Fusion and Integration for
Intelligent Systems (MFI), 2012 IEEE Conference on,
pages 192–199. IEEE.
Susperregi, L., Martínez-Otzeta, J. M., Ansuategui, A.,
Ibarguren, A., and Sierra, B. (2013). RGB-D, laser and
thermal sensor fusion for people following in a mobile
robot. Int. J. Adv. Robot. Syst.
Talha, M. and Stolkin, R. (2012). Adaptive fusion of infra-
red and visible spectra camera data for particle filter
tracking of moving targets. In Sensors, 2012 IEEE,
pages 1–4. IEEE.
Vemulapalli, R., Arrate, F., and Chellappa, R. (2014).
Human action recognition by representing 3D skeletons
as points in a Lie group. In Computer Vision and Pattern
Recognition (CVPR), 2014 IEEE Conference on.
Vidas, S., Lakemond, R., Denman, S., Fookes, C., Sridha-
ran, S., and Wark, T. (2012). A mask-based approach
for the geometric calibration of thermal-infrared cam-
eras. Instrumentation and Measurement, IEEE Trans-
actions on, 61(6):1625–1635.
Vidas, S., Moghadam, P., and Bosse, M. (2013). 3D thermal
mapping of building interiors using an RGB-D and
thermal camera. In Robotics and Automation (ICRA),
2013 IEEE International Conference on, pages 2311–
2318. IEEE.
RGB-D and Thermal Sensor Fusion - Application in Person Tracking