To compute the KL distance on those values, we normalize $p_i$ and $q_i$ as follows:
\[ \hat{p}_i = \frac{p_i}{\sum_{j=1}^{n} p_j}, \qquad \hat{q}_i = \frac{q_i}{\sum_{j=1}^{n} q_j} \]
The KL distance is defined as
\[ d_{kl} = d(\hat{q}, \hat{p}) = \sum_{i=1}^{n} \hat{q}_i \log \frac{\hat{q}_i}{\hat{p}_i} \]
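The normalization and the KL distance above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names and the `eps` guard against zero values are assumptions that do not appear in the text.

```python
import math


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


def kl_distance(p, q, eps=1e-12):
    """Return d_kl = sum_i q_hat_i * log(q_hat_i / p_hat_i).

    eps is an assumption of this sketch: it keeps the logarithm finite
    when some p_hat_i is zero. Terms with q_hat_i == 0 contribute 0.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    p_hat = normalize(p)
    q_hat = normalize(q)
    return sum(
        qh * math.log(qh / max(ph, eps))
        for ph, qh in zip(p_hat, q_hat)
        if qh > 0
    )
```

Because both inputs are normalized first, scaling either one by a constant leaves the result unchanged.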
How S is selected can be critical. To maximize the difference between the $p_i$’s and $q_i$’s, it is best to use all the points in I; however, the computational cost can be prohibitive. Instead, by sampling points from I, we typically get equivalent results as long as the sampling process is reasonable. We sample points uniformly along path-length values. In practice, when we choose 100 points randomly spaced at 1% segments of path-length, the results are equivalent to using all the data points.
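One possible reading of this sampling scheme is stratified sampling: one random position inside each 1% segment of the total path length. The sketch below assumes that reading. The function name, the `seed` parameter, and the use of fractions of path length as the return value are illustrative choices, not details from the text.

```python
import random


def sample_path_fractions(n_segments=100, seed=None):
    """Draw one random position inside each of n_segments equal segments.

    Returns positions as fractions of the total path length (between 0 and 1),
    in increasing order. Mapping a fraction back to a point in I is left
    to the caller.
    """
    rng = random.Random(seed)
    width = 1.0 / n_segments
    return [(k + rng.random()) * width for k in range(n_segments)]
```

With `n_segments=100` this yields 100 sample positions, one per 1% segment, so coverage along the path is uniform while the exact positions remain random.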
By examining the KL distance, we can measure how different two distributions are. However, because the KL distance is computed on the normalized values $\hat{p}_i$ and $\hat{q}_i$, it can be misleading. For instance, when $\hat{f}_M(x)$ and $\hat{f}_I(x)$ are uniform distributions over different ranges, all the $p_i$’s are very low and all the $q_i$’s are very high. Although the two distributions are quite different, after normalization $\hat{p}_i$ and $\hat{q}_i$ form almost identical distributions, and $d_{kl}$ is misleadingly close to 0.
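A short worked case makes this failure concrete. It assumes the idealized situation in which every $p_i$ equals a constant $a$ and every $q_i$ equals a constant $b$; the constants $a$ and $b$ are introduced here only for illustration.

```latex
% Assumption: p_i = a and q_i = b for every sample i, with 0 < a < b.
\hat{p}_i = \frac{a}{\sum_{j=1}^{n} a} = \frac{1}{n}, \qquad
\hat{q}_i = \frac{b}{\sum_{j=1}^{n} b} = \frac{1}{n}
\quad\Longrightarrow\quad
d_{kl} = \sum_{i=1}^{n} \frac{1}{n} \log \frac{1/n}{1/n} = 0
```

The magnitudes $a$ and $b$ cancel during normalization, so the KL distance cannot see how far apart they are.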
To overcome this limitation, we introduce an additional distance measure which represents a quantitative difference between the $p_i$’s and $q_i$’s as follows:
\[ d_r = \left| 1 - \frac{\bar{p}}{\bar{q}} \right| \tag{3} \]
where $\bar{p} = (\sum p_i)/n$ and $\bar{q} = (\sum q_i)/n$.
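Equation (3) translates directly into code. The sketch below is illustrative only; the function name and the error handling for empty input or a zero mean are assumptions.

```python
def ratio_distance(p, q):
    """Return d_r = |1 - p_bar / q_bar|, where p_bar and q_bar are plain means."""
    if not p or len(p) != len(q):
        raise ValueError("p and q must be non-empty and of equal length")
    p_bar = sum(p) / len(p)
    q_bar = sum(q) / len(q)
    if q_bar == 0:
        raise ValueError("the mean of q must be non-zero")
    return abs(1.0 - p_bar / q_bar)
```

For example, if every $p_i$ is 0.01 and every $q_i$ is 0.04, the normalized distributions are identical, yet `ratio_distance` returns a value close to 0.75, so the difference in magnitude is still detected.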
3.2 Robust Distance Measure
Human appearance in video streams varies over time.
In outdoor scenes, lighting, human pose variation and
carried objects may lead to changes in the foreground
region. To cope with such variations we employ a
robust estimation norm that adjusts the weighting of
points within the distance metric based on whether
points are inliers or outliers.
For the robust estimation, we employ the general M-estimator of (Huber, 1977), which minimizes the objective function
\[ \sum_{i=1}^{n} \rho(e_i) = \sum_{i=1}^{n} \rho(y_i - x_i^T b) \tag{4} \]
where the $x_i$’s are independent variables, the $y_i$’s are data points, $b$ is a coefficient vector, $\rho$ is the loss function (its derivative $\rho'$ is the influence function), and $n$ is the number of data points.
If we define the weight function $\omega(e) = \rho'(e)/e$ and let $\omega_i = \omega(e_i)$, then minimizing (4) requires solving the following equation:
\[ \sum_{i=1}^{n} \omega_i (y_i - x_i^T b)\, x_i^T = 0 \tag{5} \]
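The step from (4) to (5) can be written out explicitly. This is the standard derivation for M-estimators, given here as a sketch with $e_i = y_i - x_i^T b$.

```latex
% Set the gradient of (4) with respect to b to zero, with e_i = y_i - x_i^T b:
\frac{\partial}{\partial b} \sum_{i=1}^{n} \rho(e_i)
  = -\sum_{i=1}^{n} \rho'(e_i)\, x_i^T = 0
% Substitute \rho'(e_i) = \omega_i e_i, which follows from \omega(e) = \rho'(e)/e:
\sum_{i=1}^{n} \omega_i \, e_i \, x_i^T
  = \sum_{i=1}^{n} \omega_i \,(y_i - x_i^T b)\, x_i^T = 0
```

Since each $\omega_i$ depends on the residual, and hence on $b$, the equation is usually solved iteratively rather than in closed form.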
In our approach, we define a new feature, $\delta_i$, using $p_i$ and $q_i$ for each sample point $s_i$:
\[ \delta_i = \frac{|q_i - p_i|}{\max(p_i, q_i)} \]
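A minimal sketch of the $\delta_i$ computation follows. The function name is hypothetical, and returning 0 when both $p_i$ and $q_i$ are zero is an assumption made here to avoid division by zero; the text does not address that case.

```python
def delta_features(p, q):
    """Return delta_i = |q_i - p_i| / max(p_i, q_i) for each sample point."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    deltas = []
    for p_i, q_i in zip(p, q):
        largest = max(p_i, q_i)
        # Assumption: identical zero values are treated as a perfect match.
        deltas.append(0.0 if largest == 0 else abs(q_i - p_i) / largest)
    return deltas
```

For non-negative inputs each $\delta_i$ lies between 0 (identical values) and 1 (one value is zero while the other is not).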
When the current instance is correctly matched to a model, most $p_i$’s are similar to the corresponding $q_i$’s, so the $\delta_i$’s are close to 0. On the other hand, when the instance and model are mismatched, most $\delta_i$’s will be greater than 0. The mean of the $\delta_i$’s therefore represents how well the current instance is matched to the model. We apply the robust fitting (5) to compute the robust mean of the $\delta_i$’s, $\mu$; it can be written as
\[ \sum_{i=1}^{n} \omega_i (\delta_i - \mu) = 0 \]
Notice that the weights are designed to minimize the influence of outliers. In other words, the weight of each data point depends on how far the point is from the mean. Data points near the estimated mean get high weights; points that are far from the mean get smaller weights.
We used the iteratively re-weighted least squares (IRLS) method with the bisquare weight function to solve the equation and obtain a robust mean, as in (Coleman et al., 1980) and (Fox, 2002).
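The following is a minimal sketch of IRLS for a robust mean with bisquare weights, not the authors' implementation. Several details are assumptions because the text does not state them: the tuning constant `c = 4.685`, the scale estimate MAD / 0.6745, the median as starting value, and the stopping tolerance. These are common textbook choices.

```python
import statistics


def bisquare_weight(u, c=4.685):
    """Tukey bisquare weight for a scaled residual u (zero beyond |u| >= c)."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def robust_mean(values, c=4.685, tol=1e-8, max_iter=100):
    """Estimate a robust mean by IRLS; returns (mu, weights)."""
    mu = statistics.median(values)  # assumption: start from the median
    weights = [1.0] * len(values)
    for _ in range(max_iter):
        residuals = [v - mu for v in values]
        # Assumption: robust scale from the median absolute deviation.
        scale = statistics.median([abs(r) for r in residuals]) / 0.6745
        if scale == 0:
            # At least half of the values coincide with mu; keep only those.
            weights = [1.0 if r == 0 else 0.0 for r in residuals]
            break
        weights = [bisquare_weight(r / scale, c) for r in residuals]
        # Weighted mean: the solution of sum_i w_i * (delta_i - mu) = 0.
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu, weights
```

As a usage example, for the values `[0.05, 0.04, 0.06, 0.05, 0.05, 0.95]` the plain mean is 0.2, while `robust_mean` returns an estimate close to 0.05 and assigns the last value a weight of zero.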
Once the estimated mean has converged, the weights from the final iteration are examined to find inliers. Only data points whose weight is greater than a certain threshold value are regarded as inliers. The two distances, $d'_r$ and $d'_{kl}$, are recomputed using only the inliers. Fig. 2 shows examples of outliers and inliers as determined by the robust fitting method for a sample region that has been manually altered by changing its color.
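The inlier selection step can be sketched as a simple filter over the final weights. The function name is hypothetical, and the threshold is left as a required argument because the text does not give its value.

```python
def select_inliers(p, q, weights, threshold):
    """Keep only the (p_i, q_i) pairs whose final weight exceeds threshold.

    The threshold value is not specified in the text, so the caller
    must supply it.
    """
    if not (len(p) == len(q) == len(weights)):
        raise ValueError("p, q and weights must have the same length")
    kept = [(p_i, q_i) for p_i, q_i, w in zip(p, q, weights) if w > threshold]
    p_in = [p_i for p_i, _ in kept]
    q_in = [q_i for _, q_i in kept]
    return p_in, q_in
```

The two returned lists are then the inputs from which $d'_r$ and $d'_{kl}$ would be recomputed.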
4 SPATIAL ANALYSIS
Sometimes it is possible to improve the accuracy of the models in the gallery and the matching performance by utilizing the relative order of participants. We perform this as follows.
For each model, $M_i$, we compute an adjacency matrix, $F_i$, that captures the frequency of spatial ordering among models. Each adjacency matrix $F_i$ is $m \times n$, where $n$ is the number of models and $m$ indexes relative positions. For example, if N is the
SIGMAP 2007 - International Conference on Signal Processing and Multimedia Applications