Ghost Pruning for People Localization in Overlapping Multicamera

Systems

Muhammad Owais Mehmood

, Sebastien Ambellouis

and Catherine Achard

LEOST, French Institute of Science and Technology for Transport, Spatial Planning, Development and Networks,

Villeneuve D’Ascq, France

Institute for Intelligent Systems and Robotics, Universite Pierre et Marie Curie, Paris, France

Keywords:

People Localization, Ghost Pruning, Multicamera Surveillance.

Abstract:

In this paper, we propose a novel ghost pruning technique for multicamera people localization in overlapping

scenarios. First, synergy map is obtained from multiplanar projections across multiple overlapping cameras.

Second, occupancy map is generated by back projection from the synergy map across various image layers.

This back projected occupancy map is combined with constraints to remove ghosts. The novelty of this paper

is the introduction of an intuitive ghost pruning technique, which does not require any temporal information.

Experiments on a sequence of the PETS 2009 dataset show signiﬁcant reduction in the number of ghosts. The

purpose and novelty of this paper is focused to the ghost pruning module but detection metrics show results

comparable to those of the complete, state-of-the-art multicamera object detection systems.

1 INTRODUCTION

Advances in the imaging technology have enabled

ubiquitous presence of cameras allowing multicamera

setups to be used in scenarios like visual surveillance,

3D reconstruction or human modeling. As humans

are an integral part of the environment, their localiza-

tion, tracking and analysis is an important aspect of

research. Signiﬁcant research has been done on single

camera people localization but the approach remains

restricted for example at small scales, under occlusion

(Dollar et al., 2012). Multicamera localization is thus

a straightforward extension in order to improve the

localization accuracy and robustness.

Fusion of information across multiple views is a

challenging aspect of multicamera systems requiring

some level of consistency across all the views of the

object of interest, including its presence or not. Ho-

mography and planar projection techniques provide

a reasonable degree of success when addressed to

solve this issue (Eshel and Moses, 2010; Fleuret et al.,

2007; Khan and Shah, 2009). However, the multiview

homography constraint suffers from false detections

known as “ghosts”. Here it is important to point out

that even if temporal information and tracking tech-

niques are known to reduce ghosts, this papers tries to

solve the ghost problem at its fundamental and uses

only spatial information. The technique introduced

in this paper is simple, intuitive yet effective because

the detections are veriﬁed across the multiple camera

views by using back projected occupancies. Hence,

the novelty of this paper is to perform effective ghost

pruning without temporal information.

The remainder of this paper proceeds with a dis-

cussion on the state-of-the-art methods. Section 3 in-

troduces the concepts of multiplanar projections and

synergy map. Section 4 addresses the back projected

occupancies. Experimental setup and the results on

PETS 2009 dataset along with comparisons are pre-

sented in Section 5 with conclusion in the end.

2 RELATED WORK

Since the last few decades, people localization has

been an active area of research. There has been a lot

of development in the single camera localization, a

recent literature survey is (Dollar et al., 2012). How-

ever, as summarized by Dollar et al., single camera

algorithms are limited in terms of scale, handling oc-

clusion and for performance in the dense and cluttered

environments. Introducing tracking, as presented in

(Yilmaz et al., 2006), may alleviate the problem but

it does not completely solve it. Compared to sin-

gle camera, multicamera systems can inherently pro-

vide more information and thus have been exploited

632

Mehmood M., Ambellouis S. and Achard C..

Ghost Pruning for People Localization in Overlapping Multicamera Systems.

DOI: 10.5220/0004741306320639

In Proceedings of the 9th International Conference on Computer Vision Theory and Applications (VISAPP-2014), pages 632-639

ISBN: 978-989-758-004-8

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

for this problem. This section speciﬁcally focuses on

the multicamera surveillance applications for overlap-

ping cameras and literature based on the ghost prun-

ing methods.

Registration of an object present across multiple

camera views can be used to estimate its location.

One common approach in these systems constraints

the search space to the ground plane using the planar

world assumption (Eshel and Moses, 2010; Fleuret

et al., 2007; Khan and Shah, 2009). Therefore, as-

suming that the objects do not ﬂoat in the air, pla-

nar homographies are calculated for the ground plane.

Recent approaches extend this by using multiplanar

homographies combined with the ground plane but

this is not robust for several reasons such as the bad

foreground detections or the occlusion of the lower

part of the body.

Khan and Shah introduce the planar homographic

constraint at multiple planes and combine it with

graph cut segmentation to track people (Khan and

Shah, 2009). No calibration information is required

but planar references must be present in at least one

of the views and afﬁne homography must be manually

computed by the user for each sequence. Their pro-

posed solution suffers from false positives or ghosts

due to the limitations of the homography constraint.

Khan and Shah account for ghosts using the space-

time occupancies. Eshel and Moses perform peo-

ple tracking in a dense, crowded environment using

homography constraints at the top layers combined

with the pixel intensity correlation and motion direc-

tion, velocity constraints (Eshel and Moses, 2010).

This method requires the use of partial calibration

data. Temporal information is used to reduce phan-

toms. But, the algorithm is limited to those sequences

in which heads are visible in a top view conﬁgura-

tion. Different from the ﬁrst two techniques, Fleuret

et al. deﬁne a probabilistic occupancy map based on a

quantized ground plane along with a distance measure

in relation to the multiview projections (Fleuret et al.,

2007). They further integrate it with Hidden Markov

Model (HMM) for joint color, motion and occupancy

modeling to perform tracking. However, this algo-

rithm is limited to tracking up to a maximum of six

people, performs poorly in dense situations and fails

to account for height variations like the detection of

children. More recently, Utasi and Benedek introduce

novel features, a 3D conﬁguration model and its opti-

mization in order to perform multicamera people de-

tection (Utasi and Benedek, 2011; Utasi and Benedek,

2012).

In parallel with the complete detection or track-

ing systems, research also focuses on resolving more

fundamental issues such as ghosts. Ren et al. deﬁne

ghosts as the false positives due to the intersections of

non-corresponding regions (Ren et al., 2012). They

propose to use color template matching for ghost

pruning. But, as we will show later, their method

is unable to account for views with high variations

in the color constancy. Moreover, their equations are

limited to only two views. Unlike (Ren et al., 2012),

our proposed algorithm has no limitation in the num-

ber of views, number of planes used and is able to

account for views which lack color constancy. Evans

et al. introduced a suppression map technique which

is able to predict the possible location of the ghosts

based on the scene geometry but it requires prior in-

formation about the location of the objects of interest

which is obtained from the previous frames (Evans

et al., 2012). Unlike this method, our proposed tech-

nique does not require any temporal information.

The novelty of our method is to perform ghost

pruning without using temporal information. We also

account for color constancy variations and our algo-

rithm can work across more than two camera views

by taking into account the planes at several heights

of the body, not just the top. Our algorithm has been

tested on the City Center sequence of the PETS 2009

dataset using three overlapping camera views (PETS,

2009). The results show signiﬁcant reduction in the

number of ghosts, including a comparison with (Ren

et al., 2012). Besides this, we achieve detection rates

which are better than the Probability Occupancy Map

(POM) detector module of (Fleuret et al., 2007) and

results comparable to one of the more recent multi-

camera people detector in (Utasi and Benedek, 2012).

3 MULTIPLANAR PROJECTIONS

AND SYNERGY MAP

The multiplanar projection algorithm as proposed in

(Utasi and Benedek, 2012) is used. The inputs for the

algorithm are the foreground masks F

(x,y) of each

view v. Instead of using the Mixture of Gaussians

(MoG), as employed by Utasi and Benedek, our fore-

ground masks are obtained using the more robust mul-

tilayer background subtraction method as proposed in

(Yao and Odobez, 2007). Next, the multiplanar pro-

jections are used to create the synergy map as ex-

plained in the following two sections.

3.1 Multiplanar Projections

The camera calibration model is used to project the

silhouettes obtained by background subtraction to

the ground plane and the planes parallel to it. If

GhostPruningforPeopleLocalizationinOverlappingMulticameraSystems

633

) represent the ground coordinates of any cam-

era placed at height h

then as presented by (Utasi and

Benedek, 2012):

= x

− (x

− x

)h/h

= y

− (y

− y

)h/h

(1)

where, (x

) are the ground coordinates of an

arbitrary point and (x

) are the coordinates of the

same point projected at a height h parallel to the

ground.

If camera calibration model or homography infor-

mation is not available then the method proposed by

Khan and Shah in section 4.2 of their paper can be

used (Khan and Shah, 2009). Figure 1 illustrates the

projections obtained on a randomly selected frame of

the PETS 2009 sequence.

(a) (b) (c)

Figure 1: Multiplanar projections on a frame of PETS 2009.

Projections at (a) ground (b) 100 cm (c) 190 cm.

3.2 Synergy Map

Khan and Shah deﬁne synergy map as the 2D grid of

object occupancy likelihoods (Khan and Shah, 2009).

Synergy map is obtained by the fusion of the multi-

planar projections obtained in the last section. Let us

assume that P

v,p

(x,y) is the map corresponding to the

projection of the camera view v ∈V on the plane p ∈ P

and n

is the total number of views. Then, the synergy

map S(x, y) and normalized synergy map S

(x,y) are

generated as follows:

S(x,y) =

∑

∏

v,p

(x,y)

(x,y) =

S(x,y)

max(S(x,y))

(2)

For the rest of the paper we will use the normal-

ized synergy map and refer to it as S(x,y) or simply

S. Figure 2 shows an example of the synergy map

generated.

Figure 2: Illustration of a normalized synergy map obtained

by using Equation 2. Brighter red colors indicate higher

probabilities. The blue dots represent the ground truth that

is the location of the people. The green dots represent the

ghosts.

(a) (b)

Figure 3: Illustration of ghosting phenomenon in an arbi-

trary scenario. Here, three real objects are known as a pri-

ori. Black boundary represents a physical area. Red, blue

and cyan are lines drawn from 3 cameras to the known ob-

jects. Green points are the accurate detections and yellow

points are ghosts. (a) illustrates ghost generation using two

cameras. (b) is generated using an additional camera with a

limited Field of View (FOV).

4 BACK PROJECTED

OCCUPANCY MAPPING

Object detection from the synergy map is possible by

identifying the peaks. However, as shown in Figure 2,

some false peaks are also present. A simple illustra-

tion in Figure 3 shows how the intersection of a line

from the camera center to the object of interest gener-

ates a ghost detection, a concept discussed in detail in

(Evans et al., 2012). Moreover, any errors in the cam-

era calibration, time synchronization and background

subtraction stage is likely to generation ghosts, a re-

alistic scenario in the current systems. As we are fo-

cusing on localization therefore we further constraint

ourselves not to use prior knowledge about the num-

ber of people or their location.

Our idea is simple and intuitive. We propose

to verify if an object is present at each high proba-

bility location of the 2D synergy map by intersect-

ing its back projected occupancy map, denoted by

Moc

v,h

(x,y), and its corresponding foreground mask

(x,y). Because a person can be occluded in one

VISAPP2014-InternationalConferenceonComputerVisionTheoryandApplications

634

view therefore the back projected occupancy map is

computed across several views v. Moreover, to take

into account the inaccurate foreground detections, the

back projected occupancy map calculations are done

with respect to a set H of planes including the ground

plane and the planes parallel to the ground at different

heights h.

The back projected occupancy map is calculated

as follows. First, the high probability locations map

(HPL) is computed by thresholding the 2D synergy

map. Second, a segmentation step is applied to HPL

to yield an image containing regions that may corre-

spond to people. This segmented synergy map is back

projected on each view v leading to several occupancy

maps Moc

v,h

(x,y). Two back projected occupancy

maps are illustrated in Figure 4. All the occupancy

maps are studied to ﬁnally decide for each region of

the segmented synergy map if it is a ghost or a per-

son. These regions are further constrained to account

for the size of minimum human projection in the syn-

ergy map, Area

T H

. This threshold can be calculated a

priori by the camera setups and calibration.

Moreover, as it is difﬁcult to properly ﬁx a thresh-

old to binarize the synergy map, we use hysteresis

thresholding and obtain two sets of regions: R

and

. R

is the set of regions obtained with zero or no

threshold and R

is the set of regions obtained at a

higher threshold denoted by T H

. Some quick oper-

ations are performed to clean up the map regions R

when the region bounding boxes overlap more than

60%, only the largest region is retained. The map

regions R

is often over-segmented due to the high

threshold T H

but the corresponding regions in R

are not affected by this. So, if multiple regions are

detected as real objects in R

, in a close spatial prox-

imity, and if all of them correspond to a single region

in the set R

, then, only one detection is retained at

the location of the region in R

Now, we have to ﬁnd a criterion to decide if a

region r belonging to R

corresponds to a real ob-

ject or to a ghost. This criterion, similar to an oc-

cupancy rate, uses the generated back projected occu-

pancy maps and is deﬁned by:

r,v,h

∑

x∈X

∑

y∈Y

(x,y)Moc

v,h

(x,y)

∑

x∈X

∑

y∈Y

Moc

v,h

(x,y)

(3)

where F

(x,y) is the foreground mask in the im-

age view v, X and Y are the coordinates of the pixels

belonging to the projection of the region r in the view

v and h is the assumed height of the synergy map ac-

cording to the ground plane.

And the region r is detected as a real object if,

∀v ∈ V, ∃h ∈ H, O

r,v,h

> T (h) (4)

So, there is an assumed height of the synergy map

such as the region r of the synergy map corresponds

to binary detections in all of the views v.

The parameter T (h) is slightly increased with the

height, accounting for the assumption that if a pro-

jection is missing in one view for an assumed height

then, there is a higher probability that it is also miss-

ing in the other heights, and if it is present then T (h)

must be higher as well. The speciﬁc combinations of

heights, views and the values of the thresholds are dis-

cussed in the next section. The process is summarized

in Algorithm 4.1.

Algorithm 4.1: Back Projected Occupancy Mapping.

1: clean up the map regions R

2: for ∀ h ∈ H do

3: compute the occupancy maps for each view v,

Moc

v,h

(x,y)

4: for ∀ r ∈ R

5: compute the criterion O

r,v,h

in each view

6: if ∀ v, O

r,v,h

> T (h) & Area(r) > Area

T H

then

7: add region r to the detection list and

remove region r from R

8: end if

9: end for

10: end for

11: remove over-segmentation using R

5 EXPERIMENTAL RESULTS

For the evaluation of our algorithm, we used a public

sequence City Center of PETS 2009 dataset (PETS,

2009). For comparison purposes, the dataset, ground

truth, and annotations are processed in the same fash-

ion as described by (Utasi and Benedek, 2012). The

evaluation sequence is an outdoor dataset contain-

ing 400 frames across three views with a large FOV

(View 001, View 002, View 003). As shown in Fig-

ure 7, the Area of Interest (AOI) used is of the size

12.2m × 14.9m and is selected on the basis of visi-

bility from all the views. There are a maximum of 8

people present simultaneously in the scene with cases

of occlusion and cluttering. It is also important to

mention the inaccuracies in time synchronization and

camera calibration. Even if they are not signiﬁcant

they cause noticeably different projections in the three

views.

For evaluation, we use the metrics introduced by

(Utasi and Benedek, 2012):

• The Missed Detections Rate (MDR) corresponds

to the cases where no detection is assigned to the

ground truth.

GhostPruningforPeopleLocalizationinOverlappingMulticameraSystems

635

• The False Detections Rate (FDR) corresponds to

the occurrences when a detection is assigned to an

area without ground truth.

• The Multiple Instances Rate (MIR) corresponds

to the situations where multiple detections match

with the same ground truth area.

• The Total Error Rate (TER) is a combination of

the last three measures deﬁned by TER = MDR +

MIR + FDR.

All of these evaluations are performed in the 2D

image domain. These measures are expressed in per-

cent of the number of all objects annotated in the

dataset. The ground truth takes into account the inac-

curacies due to time synchronization and calibration

inaccuracies and is relaxed accordingly. Similarly, it

is also adjusted for the ambiguities present at the bor-

ders of the AOI. Therefore, this popular sequence is

challenging in many aspects.

The synergy map is generated using all the planes,

p ∈ P, between 100 to 150cm, at a distance of 1 cm

each. We use this speciﬁc combination, because the

torso region is present in all views without undergo-

ing too many occlusions. Moreover, as the heights

of people can vary and the system has to be robust,

the top layers are not used. For back projection, we

use two heights h: 0cm and 5cm. Figure 4 illustrates

a case of varying information present across the two

different heights. For PETS 2009, we ﬁnd this selec-

tion of two heights as sufﬁcient with respect to the

pruning efﬁciency and computational performance of

the algorithm, as the computation performance is neg-

atively impacted with the increase in the number of

heights. The synergy map is calculated from all three

views. We decided not to use the View 003 for back

projection step because (1) this view suffers from a

signiﬁcant perspective effect, and (2) the AOI in this

view is represented by lower number of pixels than in

the two other views. With high perspective effect, the

resolution of projected area might not be sufﬁcient for

the threshold requirement of Equation 4.

The other parameters to set include B (threshold

for background subtraction), T H

(threshold to bina-

rize the synergy map), Area

T H

(a threshold on the hu-

man size in the synergy map) and T (h) used for the

occupancy rate. For background subtraction, we use

the default parameters of the algorithm suggested by

(Yao and Odobez, 2007). The output is a probabil-

ity map that is the likelihood of each pixel to belong

to the foreground. This map is binarized using the

threshold B. For comparison purposes, TH

and B

are optimized in a fashion similar to that of (Utasi

and Benedek, 2012). For T (h), the criteria is to as-

sign a low value for the lowest height and gradually

increase it for the subsequent height, as discussed in

Table 1: Ghost pruning performance of the proposed algo-

rithm and comparison with the two color template matching

algorithms for ghost pruning proposed by Ren (Ren et al.,

2012). All parameters have been optimized so that the TER

is minimized.

Method TER FDR MDR MIR

No Pruning 0.362 0.291 0.047 0.024

Pruning 0.127 0.035 0.092 0.000

Ren Eq. (6) 0.877 0.197 0.678 0.002

Ren Eq. (7) 0.885 0.202 0.681 0.002

the previous section. The precision of this low value

is not of paramount importance because the informa-

tion is complemented with the other heights. Thus,

we use T (0) = 0.05 and T (5cm) = 0.1. The param-

eter Area

T H

is set to 25. For the multiplanar projec-

tions (synergy map), a constant 2 cm grid resolution

is used. The parameters B and T H

are optimized for

minimizing the TER. Figure 6(b) shows that the opti-

mal value for T H

is 0.8. This is a good compromise

between a loss of detection and overdetection due to

the ﬁrst threshold set to 0. Figure 6(a) shows that

the best foreground masks are obtained with B = 0.1.

This result is conﬁrmed by visually looking at the

foreground images, with almost no foreground for

B > 0.4.

For ghost pruning comparisons with (Ren et al.,

2012), both of the color feature equations were im-

plemented. Their algorithm is limited to two camera

views so it was tested on View 001 and View 002; be-

sides, any color comparison with View 003 will give

signiﬁcantly bad results due to the lack of color con-

stancy. The same threshold (a ghost has at least three

times greater joint likelihood than that of a real ob-

ject) and three planes in the torso region as suggested

by (Ren et al., 2012) are used. The planes used are:

100 cm, 124 cm and 149 cm. It was not possible to

test our algorithm on their dataset as it is not public.

(a) (b)

Figure 4: Back projected occupancy maps for a frame of

PETS 2009. (a) Generated at T (0cm), (a) Generated at

T (5cm). These occupancy maps are obtained for View 001

and T H

= 0.8. Notice the difference in the foreground

masks at the two heights.

VISAPP2014-InternationalConferenceonComputerVisionTheoryandApplications

636

Table 2: Performance of the proposed ghost pruning algo-

rithm compared to the full detection systems: POM (Fleuret

et al., 2007) and Utasi (Utasi and Benedek, 2012). All pa-

rameters have been optimized so that the TER is minimized.

Method TER FDR MDR MIR

POM 0.205 0.150 0.055 0.000

Utasi 0.107 0.014 0.087 0.006

Proposed 0.127 0.035 0.092 0.000

Proposed +

Boundary

0.100 0.021 0.079 0.000

Similarly, a comparison with (Evans et al., 2012) is

not possible since they use tracking information.

Table 1 shows the ghost pruning performance of

our algorithm. There is a signiﬁcant decrease in the

number of false detections or the number of ghosts.

The FDR improves by 25.6% and TER improves

by 23.5% with a negligible increase in MDR. The

proposed algorithm also performs signiﬁcantly better

than (Ren et al., 2012). We believe the failure of Ren

algorithm is due to (a) the template matching that does

not take into account the difference in the geometry of

the scene and planar projections, more speciﬁcally in

comparison to a less challenging Ren’s dataset (b) the

threshold selection criteria that is difﬁcult to ﬁx.

Next, we compared our proposed ghost pruning

algorithm to the complete detection systems of (Utasi

and Benedek, 2012) and (Fleuret et al., 2007) as in

Table 2. Our results show signiﬁcant improvement

over (Fleuret et al., 2007) and a comparable perfor-

mance to that of (Utasi and Benedek, 2012). The rea-

son we don’t surpass Utasi algorithm is the absence

of a 3D model and its subsequent parameter optimiza-

tion. Figure 5 shows a frame among many in which

a detection is identiﬁed as false, also giving a missed

detection, even though it is present on the boundary

of the ground truth. Utasi designs the 3D model and

ground truth for a detection to be precisely in the cen-

ter of the person’s body. If we relax the evaluation

metric to include the boundary of the ground truth in

the calculation of the different detection rates then, as

Figure 5: Illustration of a case in which a false detection

is present on the boundary box. False detection is indi-

cated by a cross (the estimated projected position) without

a rectangle. The thin white rectangle represents the missed

detections. Notice that the false detection also introduces

a missed detection. Image is cropped for illustration pur-

poses.

(a)

(b)

Figure 6: Curves between optimized algorithm parameters

and the evaluation metrics where (a) Curve for the bina-

rization threshold B, generated for a constant T H

= 0.8

(b) Curve for the threshold T H

, generated for a constant

B = 0.1.

in the last row of Table 2, we can show that our algo-

rithm, without any 3D model, can give, not only com-

parable, but identical and a little better performance

to Utasi in terms of TER.

6 CONCLUSIONS

In this paper, we have presented a technique for ghost

pruning in the context of people localization. First,

multiplanar projections are generated to produce a

synergy map. The synergy map is then complemented

with the proposed back projected occupancy mapping

technique. Experiments performed on a sequence of

PETS 2009 dataset show signiﬁcant reduction in the

number of ghosts and improvement over other meth-

ods without the use of temporal information. The de-

tection metrics also show results comparable to the

state-of-the-art on complete detection schemes. We

further propose to introduce our ghost pruning mod-

ule in a complete detection system, for example with

a 3D model or in a tracking system.

GhostPruningforPeopleLocalizationinOverlappingMulticameraSystems

637

(a)

(b)

(c)

Figure 7: Detections in the three camera views of the City Center sequence of PETS 2009 dataset. (a) View 001 (b) View 002

(c) View 003. The black rectangle indicates a correct detection, the boundary of which is the ground truth and circle deﬁnes

the estimated projected location. All people are correctly detected in all views of this frame including those in (c) despite

occlusion and clutter. The AOI is illustrated by a black rectangle with gray outline. Images are cropped for illustration

purposes.

REFERENCES

Dollar, P., Wojek, C., Schiele, B., and Perona, P. (2012).

Pedestrian detection: An evaluation of the state of

the art. IEEE Trans. Pattern Anal. Mach. Intell.,

34(4):743–761.

Eshel, R. and Moses, Y. (2010). Tracking in a dense

crowd using multiple cameras. Int. J. Comput. Vision,

88(1):129–143.

Evans, M., Li, L., and Ferryman, J. M. (2012). Suppression

of detection ghosts in homography based pedestrian

detection. In AVSS, pages 31–36. IEEE Computer So-

ciety.

Fleuret, F., Berclaz, J., Lengagne, R., and Fua, P. (2007). P.:

Multi-camera people tracking with a probabilistic oc-

cupancy map. IEEE Transactions on Pattern Analysis

and Machine Intelligence.

Khan, S. and Shah, M. (2009). Tracking multiple occluding

people by localizing on multiple scene planes. Pat-

tern Analysis and Machine Intelligence, IEEE Trans-

actions on, 31(3):505–519.

PETS (2009). Pets dataset: Performance evaluation of

tracking and surveillance. http://www.cvg.rdg.ac.uk/

PETS2009/a.html. [Online].

Ren, J., Xu, M., and Smith, J. S. (2012). Pruning phantom

detections from multiview foreground intersection. In

ICIP, pages 1025–1028.

Utasi,

A. and Benedek, C. (2011). A 3-D marked point pro-

VISAPP2014-InternationalConferenceonComputerVisionTheoryandApplications

638

cess model for multi-view people detection. In Pro-

ceedings of The IEEE Conference on Computer Vision

and Pattern Recognition, pages 3385–3392, Colorado

Springs, CO, USA.

Utasi,

A. and Benedek, C. (2012). A bayesian approach

on people localization in multi-camera systems. IEEE

Transactions on Circuits and Systems for Video Tech-

nology.

Yao, J. and Odobez, J. (2007). Multi-layer background sub-

traction based on color and texture. In IEEE Con-

ference on Computer Vision and Pattern Recognition,

2007. CVPR’07, pages 1–8.

Yilmaz, A., Javed, O., and Shah, M. (2006). Object track-

ing: A survey. ACM Comput. Surv., 38(4).

GhostPruningforPeopleLocalizationinOverlappingMulticameraSystems

639