Figure 1: Typical low-quality images of road scenes captured for video surveillance purposes.
process, and thus generates estimates of the dimen-
sions of the vehicle.
2 APPROACH OVERVIEW
The target of the method is the estimation of the di-
mensions of vehicles, which are modeled as rectan-
gular cuboids with width, height and length, in order
to classify them as one of a set of predefined vehicle
classes. The estimation is done for each time instant,
t, based on the previous estimations and the new in-
coming image observations.
The method estimates the posterior density function $p(x_t|Z_t)$,
given the complete set of observations at time $t$, $Z_t$, from
which we determine the most probable system state vector,
$x_t = (w_t, h_t, l_t)^\top$,
which models the dimensions of the vehicle. Three
main sources of information need to be available: the
calibration of the camera (including intrinsic and ex-
trinsic parameters, which can be done offline), 2D
image observations of the projection of the volume
onto the road, and prior knowledge of vehicle models.
Therefore, the proposed method can be applied on top of any
existing 2D detector, which can be fairly simple. For
instance, in this work we have used a traditional
background-foreground segmentation based on color
and a blob tracking strategy (Kim et al., 2005).
Fig. 2 illustrates an example process that gener-
ates the required information.
The proposed solution is based on a Markov Chain
Monte Carlo (MCMC) method, which models the
problem as a dynamic system and naturally integrates
the different types of information into a common
mathematical framework. This method requires the
definition of a sampling strategy, and the involved
density functions (namely, the likelihood function and
the prior models). Typically, the complexity of this
kind of sampling strategy is too high to run in real
time. For that reason, we have designed our solution
as a fast approximation to MCMC-based MAP methods
using a low number of hypotheses. The next sections
describe the details of all the abovementioned issues
as well as a brief introduction to MCMC-based
methods.
3 MCMC FRAMEWORK
MCMC methods have been successfully applied
to tracking problems of different nature (Barder and
Chateau, 2008; Khan et al., 2005). They can be used
as a tool to obtain maximum a posteriori (MAP) estimates,
provided that likelihood and prior models are available.
Basically, MCMC methods define a Markov chain,
$\{x_t^i\}_{i=1}^{N_s}$, over the space of states, $x$, such that the
stationary distribution of the chain is equal to the target
posterior distribution $p(x_t|Z_t)$. A point estimate
of the posterior distribution can then be selected
as any statistic of the sample set (e.g. sample
mean or robust mean), or as the sample, $x_t^i$, with the
highest $p(x_t^i|Z_t)$, which provides the MAP solution to
the estimation problem.
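As an illustration of the two kinds of point estimate mentioned above, the following minimal sketch computes the sample mean and the highest-posterior sample from a set of chain samples. The function name `point_estimates` and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np


def point_estimates(samples, posterior_values):
    """Return (sample mean, highest-posterior sample) for a set of chain samples.

    samples: array-like of shape (N_s, 3); each row is a state (w, h, l).
    posterior_values: array-like of shape (N_s,); the (possibly unnormalized)
        posterior evaluated at each sample.
    """
    samples = np.asarray(samples, dtype=float)
    posterior_values = np.asarray(posterior_values, dtype=float)
    mean_estimate = samples.mean(axis=0)
    # The sample with the highest posterior value is the MAP candidate.
    map_estimate = samples[np.argmax(posterior_values)]
    return mean_estimate, map_estimate


# Toy samples and posterior values (illustrative assumptions only).
samples = [[1.6, 1.4, 4.0], [1.8, 1.5, 4.4], [2.0, 1.6, 4.8]]
posterior_values = [0.2, 0.3, 0.5]
mean_estimate, map_estimate = point_estimates(samples, posterior_values)
```

The sample mean averages over all hypotheses, whereas the highest-posterior sample is always one of the hypotheses actually visited by the chain.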
Compared to other typical sampling strategies,
like sequential-sampling particle filters (Isard and
Blake, 1998), MCMC methods sample directly from the
posterior distribution instead of the prior density, which
might not be a good approximation to the optimal
importance density, and thus avoid convergence
problems (Arulampalam et al., 2002).
The analytical expression of the posterior density
can be decomposed using Bayes' rule as:
$$p(x_t|Z_t) = k\,p(z_t|x_t)\,p(x_t|Z_{t-1}) \tag{1}$$
where $p(z_t|x_t)$ is the likelihood function that models
how likely the measurement $z_t$ would be observed
given the system state vector $x_t$, and $p(x_t|Z_{t-1})$ is the
prediction information, since it provides all the
information we know about the current state before the
new observation is available. The constant $k$ is a scale
factor that ensures that the density integrates to one.
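The role of the scale factor $k$ in Eq. (1) can be illustrated numerically on a discretized one-dimensional state. The sketch below assumes, purely for illustration, Gaussian shapes for the prediction and the likelihood, a state that is a single width value in metres, and a measurement that lives directly in state space; none of these choices or numbers come from the paper.

```python
import numpy as np


def gaussian(x, mean, sigma):
    """One-dimensional Gaussian density, broadcast over NumPy arrays."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


# Discretized one-dimensional state (hypothetical: vehicle width in metres).
grid = np.linspace(1.0, 3.0, 201)
dx = grid[1] - grid[0]

# Assumed Gaussian stand-ins for the two factors of Eq. (1).
prediction = gaussian(grid, mean=1.8, sigma=0.3)  # plays the role of p(x_t | Z_{t-1})
likelihood = gaussian(2.0, mean=grid, sigma=0.2)  # plays the role of p(z_t | x_t), z_t = 2.0

unnormalized = likelihood * prediction
k = 1.0 / (unnormalized.sum() * dx)  # scale factor: makes the density integrate to one
posterior = k * unnormalized

# Grid point with the highest posterior value.
map_width = grid[np.argmax(posterior)]
```

Because $k$ does not depend on the state, the location of the maximum is unchanged by the normalization, which is why a proportional expression is sufficient for MAP estimation.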
We can directly sample from the posterior distribution
since we have its approximate analytic expression
(Khan et al., 2005):
$$p(x_t|Z_t) \propto p(z_t|x_t) \sum_{i=1}^{N_s} p(x_t|x_{t-1}^i) \tag{2}$$
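The right-hand side of Eq. (2) can be evaluated for any candidate state once a likelihood and a transition density are chosen. The following minimal sketch assumes isotropic Gaussian kernels for both and, as a simplification, a measurement expressed directly in state space; the function names, the kernel widths and the sample values are hypothetical and are not the models used in the paper.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Unnormalized isotropic Gaussian kernel between two vectors."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-0.5 * np.dot(d, d) / sigma**2))


def likelihood(z, x):
    """Assumed stand-in for p(z_t | x_t)."""
    return gaussian_kernel(z, x, sigma=0.2)


def transition(x, x_prev):
    """Assumed stand-in for p(x_t | x_{t-1}^i)."""
    return gaussian_kernel(x, x_prev, sigma=0.1)


def unnormalized_posterior(x, z, prev_samples):
    """Right-hand side of Eq. (2): likelihood times the sum over previous samples."""
    prediction = sum(transition(x, x_prev) for x_prev in prev_samples)
    return likelihood(z, x) * prediction


# Toy previous-time samples and measurement (illustrative values only).
prev_samples = np.array([[1.7, 1.4, 4.2], [1.8, 1.5, 4.4], [1.9, 1.5, 4.5]])
z = np.array([1.8, 1.5, 4.4])

near = unnormalized_posterior([1.8, 1.5, 4.4], z, prev_samples)
far = unnormalized_posterior([2.5, 2.0, 6.0], z, prev_samples)
```

A candidate close to both the measurement and the previous samples (`near`) scores higher than a distant one (`far`). Only ratios of such values matter for sampling, so the missing normalization constant is not needed.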
For this purpose we need a sampling strategy,
like the Metropolis-Hastings (MH) algorithm, which
dramatically improves the performance of traditional
particle filters based on importance sampling. In
summary, the MH algorithm generates a new sample according
to an acceptance ratio, which can be written in our case
as:
$$\alpha = \frac{p(x_t^j|Z_t)}{p(x_t^{j-1}|Z_t)} \, \frac{q(x_t^{j-1}|x_t^j)}{q(x_t^j|x_t^{j-1})} \tag{3}$$
where $j$ is the index of the samples of the current
chain. The proposed sample $x_t^j$ is accepted with
probability $\min(\alpha, 1)$. If the sample is rejected, the current
state is kept, i.e. $x_t^j = x_t^{j-1}$. The proposal density $q(x)$
VISAPP 2011 - International Conference on Computer Vision Theory and Applications