
2.2  Merging and Splitting Procedure  
A major difficulty in visual tracking arises when 
objects merge from the camera viewpoint. 
This merging situation is detected immediately by 
testing the intersection of the predicted object bounding 
boxes and the size variation of the newly detected 
region. When a merging situation is detected, 
a temporary group of objects is first defined 
in order to track the global region containing the 
visually merged objects. Each group is treated as 
a new specific entity to track, with its own indicators. In 
addition to tracking the temporary group, the 
algorithm attempts to maintain the track of each 
individual object inside the global group region. The 
position of each individual object during 
the merging situation is estimated from its appearance 
model, for which we have chosen the mean shift algorithm. 
This algorithm has been adopted as an efficient 
technique for appearance-based blob tracking 
(Comaniciu, 2000). The mean shift algorithm is a 
nonparametric statistical method for seeking the 
nearest mode of a point sample distribution 
(Cheng, 1995). The dissimilarity between the object model 
and the object candidates is measured by the 
Bhattacharyya distance. In the normal mode, the 
algorithm continuously stores the latest sub-image of 
each object obtained from the motion detection 
stage. This ensures that, at the beginning of a merging 
situation, the initial object model required by 
the mean shift algorithm is already available. 
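
The appearance comparison described above can be illustrated with a minimal sketch. It assumes that the object model and each candidate are summarised by histograms (for example colour histograms, as in Comaniciu's formulation); the function name and the list-based histogram layout are illustrative assumptions, not details given in this paper.

```python
import math

def bhattacharyya_distance(model_hist, candidate_hist):
    """Distance in [0, 1] between two histograms (0 means identical)."""
    p_total = float(sum(model_hist))
    q_total = float(sum(candidate_hist))
    if p_total <= 0 or q_total <= 0:
        raise ValueError("histograms must have positive mass")
    # Bhattacharyya coefficient: overlap of the two normalised histograms.
    rho = sum(
        math.sqrt((p / p_total) * (q / q_total))
        for p, q in zip(model_hist, candidate_hist)
    )
    # max() guards against rho exceeding 1 by a rounding error.
    return math.sqrt(max(0.0, 1.0 - rho))
```

In a mean shift tracker, the candidate location that minimises this distance would be taken as the best appearance position.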
 
In some situations, an object may be hidden 
entirely or partially by another one, so that it 
cannot be located by its appearance. In order to 
maintain tracking continuity in all situations for 
each merged object, the notion of an artificial 
observation is introduced. For each object, it is positioned 
at an estimated location P*, which is 
obtained by a weighted combination of the best 
appearance position Pa and the global group position 
Pg.   
                    
$$P^{*} = w\,P_g + (1 - w)\,P_a \qquad (2)$$
The weight w is a normalised distance between 
the object model and the best location candidate 
obtained from the appearance approach. In this 
strategy, when the object cannot be identified by the 
appearance approach, the group position is favoured. 
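
The weighting strategy can be sketched as follows. This is only an illustration under stated assumptions: w is taken to be the normalised appearance distance in [0, 1], so that a poor appearance match (w near 1) moves the estimate toward the group position, as the text describes; the function name and the tuple representation of positions are hypothetical.

```python
def artificial_observation(p_appearance, p_group, w):
    """Blend two positions; w near 0 trusts appearance, w near 1 the group."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    # Coordinate-wise weighted combination of the two candidate positions.
    return tuple(
        w * g + (1.0 - w) * a for a, g in zip(p_appearance, p_group)
    )
```

For instance, with an appearance position of (10, 20), a group position of (30, 40) and w = 0.25, the artificial observation lies a quarter of the way from the appearance position to the group position.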
 
During merging, the history of each tracked 
object included in a group (positions and indicators) 
is continuously updated by means of its artificial 
observation. This compound-observation approach 
is close to the PDAF algorithm (Bar-Shalom, 1988), 
which combines the influence of 
multiple candidates in the validation gate. During 
a merging situation, the consistency indicator of an 
object is updated by taking into account only its 
speed stability, estimated from its artificial 
observation. If a sole object joins an existing 
group during tracking, the initial group is 
maintained and its objects are updated. The global procedure during a 
merging situation may be relatively time-consuming, 
so the decision to create a group is 
validated only once it has been predicted from the 
proximity of consistent tracked objects. This makes it possible to 
tolerate some transient false detections efficiently. 
On the other hand, this strategy cannot initialise 
a group if one of the merged objects is newly created 
and has a low consistency indicator. In that case, 
only one object is considered and no group is 
created.  
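
The consistency condition on group creation can be sketched as below. The threshold value, the 0-to-1 scale of the consistency indicator, and the `Track` structure are all assumptions made for illustration; the proximity test that predicts the merge is assumed to have been performed beforehand.

```python
from dataclasses import dataclass

@dataclass
class Track:
    track_id: int
    consistency: float  # assumed scale: 0 (unreliable) to 1 (reliable)

def should_create_group(tracks, min_consistency=0.5):
    """Allow a group only if at least two tracks merge and all are consistent."""
    if len(tracks) < 2:
        return False
    # A single newly created, low-consistency track blocks group creation.
    return all(t.consistency >= min_consistency for t in tracks)
```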
 
As with merging, splitting has to be 
detected immediately. A splitting situation is 
detected once a new object is detected close to a 
temporary group region. Motion segmentation errors 
may, however, briefly split a real sole 
object or a group. To reduce the influence of such 
detection errors, the splitting decision 
is made only when it has been confirmed over a fixed 
time delay (1 second in this implementation). 
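
The confirmation delay can be sketched as a simple persistence check. The class name, the per-frame `update` interface, and the use of timestamps in seconds are assumptions for illustration; only the 1-second delay comes from the text.

```python
class SplitConfirmer:
    """Confirm a split only after it has persisted for `delay` seconds."""

    def __init__(self, delay=1.0):
        self.delay = delay
        self._first_seen = None  # time the candidate split first appeared

    def update(self, split_observed, timestamp):
        """Feed one frame; return True once the split is confirmed."""
        if not split_observed:
            self._first_seen = None  # candidate vanished: start over
            return False
        if self._first_seen is None:
            self._first_seen = timestamp
        return timestamp - self._first_seen >= self.delay
```

A split that disappears before the delay has elapsed resets the timer, so short-lived segmentation errors never trigger the splitting procedure.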
 
When the algorithm detects split regions 
associated with a known group, a specific procedure 
focuses on establishing the identity of the objects. 
After splitting, the group updates its list of individual 
objects. When the group is reduced to a single known 
object, the group entity is destroyed. 
 
A visual comparison between the objects before 
merging and after splitting is used to assign the best 
object identity to each region. When a region 
considered as a sole object splits, the separated detected 
regions are associated with new objects and inherit 
the history of the initial object (previous track 
positions and consistency indicators).  
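
One possible way to turn the visual comparison into identity decisions is sketched below. The paper does not state how the assignment is computed; this sketch assumes an appearance distance has already been evaluated between every stored object model and every split region, and it searches all permutations exhaustively, which is practical only for the small groups considered here. The function name and the matrix layout are hypothetical.

```python
from itertools import permutations

def assign_identities(distance):
    """Match stored objects to split regions by minimal total distance.

    distance[i][j] is the appearance distance between stored object i
    and detected region j (square matrix assumed).
    Returns a list whose entry i is the region index given to object i.
    """
    n = len(distance)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(distance[i][perm[i]] for i in range(n)),
    )
    return list(best)
```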
3 CENTRAL LEVEL TRACKING 
For the multi-sensor organization, we have chosen 
a hierarchical approach. A centralized filter 
combines the results of the local tracking filters 
(sensor level) and then performs the track 
management (Fig. 2). The sensor validation 
indicator is computed at the predictive location of 
ICINCO 2006 - ROBOTICS AND AUTOMATION