$$
\begin{aligned}
M^{j}_{1,t+1,x} &= \alpha O^{j}_{S,t,x}\, v^{j}_{t,x} + (1-\alpha) M^{j}_{1,t,x}, \\
M^{j}_{1,t+1,y} &= \alpha O^{j}_{S,t,y}\, v^{j}_{t,y} + (1-\alpha) M^{j}_{1,t,y}, \\
M^{j}_{1,t+1,xy} &= \alpha O^{j}_{S,t,xy}\, v^{j}_{t,x} v^{j}_{t,y} + (1-\alpha) M^{j}_{1,t,xy}, \\
M^{j}_{2,t+1,x} &= \alpha O^{j}_{S,t,x}\, (v^{j}_{t,x})^{2} + (1-\alpha) M^{j}_{2,t,x}, \\
M^{j}_{2,t+1,y} &= \alpha O^{j}_{S,t,y}\, (v^{j}_{t,y})^{2} + (1-\alpha) M^{j}_{2,t,y}. \qquad (5)
\end{aligned}
$$
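The exponentially weighted moment recursions above can be sketched in plain Python. This is only an illustrative sketch: the function name, the dict layout keyed by "x", "y", "xy", and the way the ownership weights O are passed in are assumptions about the surrounding pipeline, not the authors' implementation.

```python
# Sketch of the moment updates in Eq. (5); all names are illustrative.
# v = (vx, vy): current motion vector of block j
# O = (Ox, Oy, Oxy): stable-component ownership weights for that block
# M1, M2: dicts holding the first- and second-order data moments
def update_moments(M1, M2, v, O, alpha):
    vx, vy = v
    Ox, Oy, Oxy = O
    M1_new = {
        "x":  alpha * Ox  * vx      + (1 - alpha) * M1["x"],
        "y":  alpha * Oy  * vy      + (1 - alpha) * M1["y"],
        "xy": alpha * Oxy * vx * vy + (1 - alpha) * M1["xy"],
    }
    M2_new = {
        "x": alpha * Ox * vx ** 2 + (1 - alpha) * M2["x"],
        "y": alpha * Oy * vy ** 2 + (1 - alpha) * M2["y"],
    }
    return M1_new, M2_new
```

The learning rate alpha plays the usual running-average role: large alpha makes the moments track the current frame, small alpha favors the accumulated history.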
The stable component is updated using the first-order data moments as:
$$
s^{j}_{t+1,x} = \mu^{j}_{S,t+1,x} = \frac{M^{j}_{1,t+1,x}}{m^{j}_{S,t+1,x}}, \qquad
s^{j}_{t+1,y} = \mu^{j}_{S,t+1,y} = \frac{M^{j}_{1,t+1,y}}{m^{j}_{S,t+1,y}}. \qquad (6)
$$
The new covariance matrix entries of the stable component are evaluated as:
$$
\begin{aligned}
(\sigma^{j}_{S,t+1,x})^{2} &= \frac{M^{j}_{2,t+1,x}}{m^{j}_{S,t+1,x}} - (s^{j}_{t+1,x})^{2}, \\
(\sigma^{j}_{S,t+1,y})^{2} &= \frac{M^{j}_{2,t+1,y}}{m^{j}_{S,t+1,y}} - (s^{j}_{t+1,y})^{2}, \\
(\sigma^{j}_{S,t+1,xy})^{2} &= \frac{M^{j}_{1,t+1,xy}}{m^{j}_{S,t+1,xy}} - (s^{j}_{t+1,x})(s^{j}_{t+1,y}). \qquad (7)
\end{aligned}
$$
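Recovering the stable component's mean (Eq. (6)) and covariance entries (Eq. (7)) from the accumulated moments might look as follows; again, the function name and the dict m of mixing weights m^j_S are assumptions made for illustration.

```python
# Sketch of Eqs. (6)-(7); names are illustrative, not the authors' code.
# M1, M2: first- and second-order moment dicts (keys "x", "y", "xy")
# m: stable-component mixing weights m_S for the same keys
def stable_mean_and_cov(M1, M2, m):
    # Eq. (6): normalized first moments give the stable means.
    sx = M1["x"] / m["x"]
    sy = M1["y"] / m["y"]
    # Eq. (7): variances and the cross term follow from the moments.
    var_x = M2["x"] / m["x"] - sx ** 2
    var_y = M2["y"] / m["y"] - sy ** 2
    cov_xy = M1["xy"] / m["xy"] - sx * sy
    return (sx, sy), ((var_x, cov_xy), (cov_xy, var_y))
```

This is the standard "variance = second moment minus squared mean" identity, applied per component of the motion vector, with the cross term built from the mixed first-order moment.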
The wander component contains the current motion vectors, since it adapts as a two-frame motion change model, while the covariance matrices of the wander and lost components are updated according to the stable component's covariance matrices in order to avoid any prior preference for either component.
4 EXPERIMENTAL RESULTS
We have evaluated the efficiency of our algorithm through experimental testing. Experiments were conducted on a dataset comprising infrared video streams captured by a hand-held video camera. We present the results obtained by applying our method to a video sequence of 485 frames, in which the camera performs a 360-degree spin. Figure 1 presents the variation of the estimated affine coefficients describing the translation over the x axis (dotted black line) and the y axis (solid gray line) as the video stream evolves, while at specific moments the respective video frames are provided for visual confirmation of the obtained results. As depicted, both parameters balance around zero during the first 25 frames, since the camera remains almost still. A radical increase in the coefficient describing translation over the x axis occurs from frame 26 until the end of the video stream, as the camera starts to spin. On the other hand, the coefficient corresponding to translation over the y axis continuously balances around zero, since there is minimal movement in that direction.
[Plot: translation factor versus frame index, with curves for translation in the x axis and in the y axis.]
Figure 1: Variation of the translation affine coefficients.
5 CONCLUSIONS
In this paper a novel camera motion estimation method based on the exploitation of motion vector fields is proposed. The features that distinguish our method from other proposed camera motion estimation techniques are: 1) the integration of a novel stochastic vector field model, and 2) the incorporation of the vector field model inside a particle filter framework, enabling the method to estimate future camera movement.
VISAPP 2008 - International Conference on Computer Vision Theory and Applications