Figure 2(a) shows the results of Mean Shift
tracking: one tracker drifts to the other target
when the two persons occlude each other.
Figure 2(b) shows the results of our method, which
maintains tracking with correct identities even though
the two persons are similar in color and shape.
4.2 Target Model Update in Serious Occlusion Situation
This experiment analyses multi-target tracking
performance under serious occlusion. The
computational complexity and tracking accuracy of
our method are compared with those of some
existing methods. We used a public test sequence
named “ThreePastShop2cor.mpg” and the corresponding
ground truth file obtained from the website (Http://,
2004). The image size is 384 × 288 pixels and the
frame rate is 25 frames per second. Three persons
move in this video. We initialize the three persons
at frame No. 400 with colored rectangular boxes
according to the ground truth file. The left person is
labeled A with a blue box, the middle person B with
a green box, and the right person C with a red box.
In this video, A and B, and A and C, are at times
seriously occluded, and A and C have the same color,
which makes it difficult to maintain correct tracking.
First, we compare the TSHT method (Maggio,
Cavallaro, 2005) with our method. TSHT embeds
Mean Shift into a particle filter without a target
model update process. We assign 1000 particle
samples to each target for TSHT. To compare
tracking performance on the same baseline, we use
the same initial target model and dynamic model.
Figure 3(a) gives the results of the TSHT method
with 1000 particles per target, which show several
errors, especially when two targets are seriously
occluded. The results show that the appearance of a
target changes considerably when it is occluded by
other objects. However, TSHT still tracks with the
initial target model, which makes tracker A drift to
C in the second result image and tracker C drift to B
in the fourth result image. In this situation,
assigning more particles to each target cannot
maintain correct tracking. Figure 3(b) shows the
results of our method, MSPFU, which maintains
correct tracking even though the two persons are
seriously occluded and have similar color and shape.
Our method uses only 40 particles per target in the
initialization stage.
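To illustrate what a target model update can look like, the following minimal sketch blends a newly observed color histogram into the stored model. It is only an assumption for illustration: the linear blending rule, the function names `bhattacharyya` and `update_model`, and the rate `alpha` are hypothetical and are not taken from this paper, whose own update rule may differ.

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms (1.0 = identical)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.sqrt(p * q)))

def update_model(model, candidate, alpha=0.1):
    """Blend the candidate histogram into the model.

    Hypothetical rule: new = (1 - alpha) * model + alpha * candidate.
    alpha = 0 keeps the initial model forever (no update);
    larger alpha adapts faster to appearance changes.
    """
    model = np.asarray(model, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    # Normalize both inputs so they are valid histograms.
    model = model / model.sum()
    candidate = candidate / candidate.sum()
    blended = (1.0 - alpha) * model + alpha * candidate
    return blended / blended.sum()

# Toy two-bin histograms: the stored model and a changed appearance.
model = [0.5, 0.5]
observed = [1.0, 0.0]
updated = update_model(model, observed, alpha=0.1)
print(updated, bhattacharyya(updated, observed))
```

With `alpha = 0` the sketch keeps the initial model unchanged, so a tracker built on it cannot follow appearance changes of the target.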
Second, regarding computational complexity, we
compare our method MSPFU with the PF. We add a
target model update process to the PF method. In our
experiment, the PF method needs at least 100
particles per target to maintain correct tracking, and
its average processing time per frame is 722 ms.
Our tracking system MSPFU uses only 40 particles
per target in the initialization stage, and its average
processing time per frame is 436 ms, about 40% less
than PF.
In the above experiments, we compare the
computational complexity of tracking multiple
targets in terms of the number of particles. To
compare tracking accuracy, we consider the
accuracy of the target center position. The distance
between the tracked center position of a target and
the ground truth is calculated to measure the
tracking accuracy of PF and MSPFU.
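The accuracy measure can be sketched as follows. This is a minimal illustration that assumes the distance is the Euclidean distance in pixels between the two center points; the text does not name the metric. The function names and the toy coordinates are hypothetical.

```python
import math

def center_distance(tracked, truth):
    """Euclidean distance between a tracked center and the ground-truth center.

    Both arguments are (x, y) tuples in pixels (assumed metric).
    """
    return math.hypot(tracked[0] - truth[0], tracked[1] - truth[1])

def average_distance(tracked_by_frame, truth_by_frame, frames):
    """Mean center distance over the given frame numbers."""
    distances = [
        center_distance(tracked_by_frame[f], truth_by_frame[f]) for f in frames
    ]
    return sum(distances) / len(distances)

# Toy data for one target on two frames (values are made up for illustration).
tracked = {400: (103.0, 54.0), 404: (110.0, 60.0)}
truth = {400: (100.0, 50.0), 404: (110.0, 60.0)}
print(average_distance(tracked, truth, [400, 404]))
```

A smaller value means the tracked center stays closer to the ground truth over the evaluated frames.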
(a) Results from TSHT.
(b) Results from our method MSPFU.
Figure 3: Comparison of tracking performance between
TSHT and our method MSPFU on a public video.
Figure 4 shows the tracking accuracy with 40
particles per target, from frame No. 400 to frame
No. 700, sampled every four frames. The horizontal
axis is the frame number and the vertical axis is the
distance. The smaller the distance, the higher the
tracking accuracy. MSPFU tracks all targets
correctly with a small distance from the ground
truth, while PF fails to track target A and target C
during this sequence. Figure 5 gives the tracking
accuracy with 100 particles per target. Both MSPFU
and PF track all targets correctly here, while MSPFU
has better tracking accuracy on target B than PF.
We also give the average distance over the test
frames. Table 1 lists the average distance results,
where PN is the number of particles, TPF is the
processing time per frame, AD_A is the average
distance for target A, and CT_A indicates whether
target A is correctly tracked. From the results in
Table 1, we can see that using more particles will
take more
ROBUST MULTI-TARGET TRACKING USING MEAN SHIFT AND PARTICLE FILTER WITH TARGET MODEL
UPDATE