the contrary, when using the framework, each false-positive detection is tracked in the subsequent frames. That means if an object is detected repeatedly but the matching was not successful, the object is tracked with more than one bounding box. Even if all bounding boxes follow the object correctly, the cardinality error increases, which results in a high METE.
But as one can see, a low error is reached for a low threshold t_s, whereas the threshold t_o has to have a high value to achieve a low METE. That means the color histogram is more suitable for object merging than the overlap of bounding boxes. Another influence on the error is the mutual occlusion of objects. The mean shift algorithm is not able to follow a hidden object; instead, it converges to false positions.
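The two merging criteria compared above can be illustrated in a minimal sketch. This is not the framework's implementation: the thresholds t_s and t_o, their default values, and the function names are hypothetical. The histogram criterion uses the Bhattacharyya coefficient (cf. Kailath, 1967) on normalized color histograms; the overlap criterion uses intersection over union of bounding boxes.

```python
import numpy as np

def bhattacharyya_coefficient(h1, h2):
    """Similarity of two L1-normalized color histograms (1.0 = identical)."""
    return float(np.sum(np.sqrt(h1 * h2)))

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) bounding boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def merge_by_histogram(h1, h2, t_s=0.9):
    # merge two tracks when their color appearance is similar enough
    return bhattacharyya_coefficient(h1, h2) >= t_s

def merge_by_overlap(a, b, t_o=0.5):
    # merge two tracks when their bounding boxes overlap enough
    return iou(a, b) >= t_o

# two histograms of the same object under slightly different lighting
h_a = np.array([0.4, 0.3, 0.2, 0.1])
h_b = np.array([0.35, 0.35, 0.2, 0.1])
print(merge_by_histogram(h_a, h_b))  # prints True (coefficient ~0.998)
```

The appearance-based test stays discriminative even when two separate tracks of the same object drift apart spatially, which matches the observation that a strict overlap requirement (high t_o) is needed to keep the merging reliable.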
5 CONCLUSION
In this paper we presented a framework for the detection and tracking of objects. The framework consists of three stages, with an individual algorithm applied at each stage. The stages are concatenated in such a way that they exchange information about the presence and position of objects. An algorithm analyzing the compressed video stream is used as a preselection step to provide a binary mask, which segments an image into foreground and background regions. We selected the Implicit Shape Model as the algorithm to determine the actual positions of objects. A tracking stage based on the mean shift algorithm was employed to track the detected objects. One novelty lies in the concatenation of algorithms that analyze the video sequence in both the compressed domain and the pixel domain. Another novelty is the method of object segmentation to obtain a color histogram, as needed by the mean shift algorithm. The evaluation shows good results in object segmentation and tracking with the new method. It is also shown that the complexity could be reduced significantly. Remaining challenges are the tracking of multiple persons and their mutual occlusion. These could be handled with prior knowledge, for example by evaluating the individual trajectories.
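The three-stage concatenation described above can be sketched as a simple pipeline in which each stage passes presence and position information to the next. The stage internals here are toy placeholders, not the algorithms used in the framework; only the data flow is meant to be illustrative.

```python
def preselect(frame):
    """Stage 1 placeholder for the compressed-domain analysis:
    returns a binary mask separating foreground from background.
    Here, foreground is simply any non-zero pixel."""
    return [[1 if px else 0 for px in row] for row in frame]

def detect(mask):
    """Stage 2 placeholder for the Implicit Shape Model detector:
    returns bounding boxes (x1, y1, x2, y2), restricted to the
    foreground given by the mask. Here we report one box around
    all foreground pixels."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v]
    if not pts:
        return []
    xs, ys = zip(*pts)
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]

def track(detections, tracks):
    """Stage 3 placeholder for the mean-shift tracker: associates
    detections with existing tracks. Here detections simply replace
    the track list; mean shift would instead refine each position."""
    return [{"box": d} for d in detections] if detections else tracks

def run_pipeline(frames):
    tracks = []
    for frame in frames:
        mask = preselect(frame)             # compressed-domain preselection
        detections = detect(mask)           # detection on foreground only
        tracks = track(detections, tracks)  # tracking keeps positions alive
    return tracks

frames = [
    [[0, 0, 0], [0, 7, 0], [0, 0, 0]],  # object at (1, 1)
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # detection missed; track persists
]
print(run_pipeline(frames))  # [{'box': (1, 1, 2, 2)}]
```

The second frame shows the benefit of the concatenation: when the detector fails, the tracking stage carries the last known position forward instead of dropping the object.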
ACKNOWLEDGEMENTS
The research leading to these results has received funding from EIT ICT Labs’ Action Line “Future Cloud” under activity no. 11882.
REFERENCES
Andriluka, M., Roth, S., and Schiele, B. (2008).
People-Tracking-by-Detection and People-Detection-
by-Tracking. In Proc. 2008 IEEE Conf. on Computer
Vision and Pattern Recognition (CVPR), pages 1–8.
Berclaz, J., Fleuret, F., Turetken, E., and Fua, P. (2011).
Multiple Object Tracking using K-Shortest Paths Op-
timization. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 33(9):1806–1819.
Comaniciu, D., Ramesh, V., and Meer, P. (2003). Kernel-
based Object Tracking. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 25(5):564–577.
Corrales, J., Gil, P., Candelas, F., and Torres, F. (2009).
Tracking based on Hue-Saturation Features with a
Miniaturized Active Vision System. In Proc. 40th Int.
Symposium on Robotics, pages 107–112.
Eiselein, V., Fradi, H., Keller, I., Sikora, T., and Dugelay, J.-
L. (2013). Enhancing Human Detection Using Crowd
Density Measures and an Adaptive Correction Filter.
In Proc. 2013 10th IEEE Int. Conf. on Advanced Video
and Signal Based Surveillance (AVSS), pages 19–24.
Evans, M., Osborne, C., and Ferryman, J. (2013). Mul-
ticamera Object Detection and Tracking with Object
Size Estimation. In Proc. 2013 10th IEEE Int. Conf.
on Advanced Video and Signal Based Surveillance
(AVSS), pages 177–182.
Kailath, T. (1967). The Divergence and Bhattacharyya Dis-
tance Measures in Signal Selection. IEEE Transac-
tions on Communication Technology, 15(1):52–60.
Laumer, M., Amon, P., Hutter, A., and Kaup, A. (2013).
Compressed Domain Moving Object Detection Based
on H.264/AVC Macroblock Types. In Proc. of the
International Conference on Computer Vision Theory
and Applications (VISAPP), pages 219–228.
Leibe, B., Leonardis, A., and Schiele, B. (2004). Combined
Object Categorization and Segmentation With an Im-
plicit Shape Model. In Proc. Workshop on Statisti-
cal Learning in Computer Vision (ECCV workshop),
pages 17–32.
Lowe, D. G. (2004). Distinctive Image Features from Scale-
Invariant Keypoints. International Journal of Com-
puter Vision, 60:91–110.
MPEG (2010). ISO/IEC 14496-10:2010 - Coding of Audio-
Visual Objects - Part 10: Advanced Video Coding.
Nawaz, T., Poiesi, F., and Cavallaro, A. (2014). Measures
of Effective Video Tracking. IEEE Transactions on
Image Processing, 23(1):376–388.
Poppe, C., De Bruyne, S., Paridaens, T., Lambert, P., and
Van de Walle, R. (2009). Moving Object Detec-
tion in the H.264/AVC Compressed Domain for Video
Surveillance Applications. Journal of Visual Commu-
nication and Image Representation, 20(6):428–437.
Senst, T., Eiselein, V., and Sikora, T. (2012). Robust Local
Optical Flow for Feature Tracking. IEEE Transac-
tions on Circuits and Systems for Video Technology,
22(9):1377–1387.
Yilmaz, A., Javed, O., and Shah, M. (2006). Object Track-
ing: A Survey. ACM Computing Surveys, 38(4).
Hybrid Person Detection and Tracking in H.264/AVC Video Streams