A NOVEL BLOCK MOTION ESTIMATION MODEL FOR VIDEO
STABILIZATION APPLICATIONS
Harish Bhaskar and Helmut Bez
Research School of Informatics, Loughborough University
Keywords:
Video stabilization, motion compensation, motion estimation, genetic algorithms, Kalman filtering.
Abstract:
Video stabilization algorithms primarily aim at generating stabilized image sequences by removing unwanted
shake due to small camera movements. It is important to perform video stabilization in order to assure more
effective high level video analysis. In this paper, we propose novel motion correction schemes based on
probabilistic filters in the context of block matching motion estimation for efficient video stabilization. We
present a detailed overview of the model and compare our model against other block matching schemes on
several real-time and synthetic data sets.
1 INTRODUCTION
Video data obtained from compact motion capture devices such as hand-held and
head-mounted cameras has gained significant attention in recent years. Video
stabilization, as the name suggests, deals with generating stabilized video
sequences by removing unwanted shakes and camera motion. Several methods have
been proposed in the literature for accomplishing video stabilization. However,
the accuracy of motion estimation is key to the performance of video
stabilization. (Y.Matsushita and H.Y.Shum, 2005) propose a combination of motion
inpainting and deblurring techniques to accomplish robust video stabilization.
Other research contributions to video stabilization include probabilistic
methods (A.Litvin and W.C.Karl, 2003) and model based methods. Methods such as
(M.Hansen and K.Dana, 1994), (Y.Yao and R.Chellappa, 1995), (P.Pochec, 1995),
(J.Tucker and Lazaro, 1993) and (K.Uomori and Y.Kitamura, 1990) propose to
combine global motion estimation with filtering to remove motion artifacts from
video sequences. These schemes perform efficiently only under restricted
conditions and are limited by the efficiency of the global motion estimation
methodology. (K.Ratakonda, 1998) uses an integral matching mechanism for
compensating movement between frames. (T.Chen, 2000) proposes a three-stage
video stabilization algorithm based on motion estimation. The process includes
motion estimation for computing local and global motion parameters, motion
smoothing for removing abrupt motion changes between subsequent frame pairs, and
finally a motion correction methodology for stabilization. In this paper we
extend the work presented in (T.Chen, 2000) to accommodate a novel motion
correction mechanism based on moving average filters and Kalman filtering,
alongside a motion estimation strategy that combines vector quantization based
block partitioning with a genetic algorithm based block search.
2 PROPOSED MODEL
The video stabilization model proposed in this paper extends a parametric
motion model proposed in (T.Chen, 2000). An overview of the proposed model, in
the form of pseudo code, is as follows.
Input, at a time instant $t$, a pair of successive frames of a video sequence, $f_t$ and $f_{t+1}$, where $1 \le t \le N$ and $N$ is the total number of frames in the video.
Image frame $f_t$ is initially partitioned into 4 blocks using the vector quantization algorithm
Bhaskar H. and Bez H. (2007).
A NOVEL BLOCK MOTION ESTIMATION MODEL FOR VIDEO STABILIZATION APPLICATIONS.
In Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics, pages 303-306
DOI: 10.5220/0001650803030306
Copyright © SciTePress
described in the subsection below. Note: every block represents an image region.
For every block $b$:
  The centroid $(x_c, y_c)$ of the block is computed.
  A genetic algorithm, as described below, is used to accurately match the block in the successive frame $f_{t+1}$.
  If the genetic algorithm accurately matched the block in frame $f_t$ to frame $f_{t+1}$ (with error = 0), then the motion vector is evaluated as $(x^{*} - x, y^{*} - y)$, where $(x^{*}, y^{*})$ is the estimated transformed centroid of the block in frame $f_{t+1}$.
  If the genetic algorithm returned a non-zero matching error, then the process is repeated by further subdividing the block.
  The process is terminated either when no further splitting is needed or when a predefined block size is reached.
If the processed frame pair is $(f_t, f_{t+1})$ with $t = 1$, then proceed to the next frame pair; otherwise, if $t > 1$, run motion correction using any of the proposed filter mechanisms to generate smoothed motion vectors $MV'$.
Compute the difference between the original motion vectors $MV$ and the smoothed motion vectors $MV'$, and adjust the original motion vectors by this difference: $MV_{comp} = MV \pm (MV - MV')$.
Generate stabilized frames using the original motion vectors $MV$ and the compensated motion vectors $MV_{comp}$, and represent them as $f_{t+1}$ and $f^{comp}_{t+1}$.
Deduce the PSNR of the two versions of stabilized frames. PSNR for a gray scale image is defined as:
$$ \mathrm{PSNR} = 10 \log_{10} \left[ \frac{255^2}{\frac{1}{HW} \sum_{H} \sum_{W} \| f_{t+1} - f_{comp} \|^2} \right] \tag{1} $$
where $(H, W)$ is the dimensionality of the frames, and $f_{t+1}$ and $f_{comp}$ are the intensity components of the original target and the motion compensated images, which equal $f_{t+1}$ and $f^{comp}_{t+1}$ respectively. PSNR values generally range between 20 dB and 40 dB; higher values of PSNR indicate better quality of motion estimation.
If $PSNR_{comp} \ge PSNR$ then use $f^{comp}_{t+1}$ as the stabilized frame for subsequent analysis; otherwise use $f_{t+1}$.
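The PSNR comparison in the last two steps can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: frames are assumed to be plain lists of rows of gray values, and the names `psnr`, `select_stabilized`, `target`, `frame_plain` and `frame_comp` are hypothetical. Both candidate frames are assumed to be scored against the same reference frame.

```python
import math

def psnr(target, candidate, peak=255.0):
    """PSNR in dB between two equally sized gray scale frames (lists of rows)."""
    h, w = len(target), len(target[0])
    # Mean squared intensity difference over all H x W pixels, as in Eq. (1).
    mse = sum(
        (target[i][j] - candidate[i][j]) ** 2
        for i in range(h)
        for j in range(w)
    ) / (h * w)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)

def select_stabilized(target, frame_plain, frame_comp):
    """Keep the compensated frame when its PSNR is at least as high (assumed rule)."""
    if psnr(target, frame_comp) >= psnr(target, frame_plain):
        return frame_comp
    return frame_plain
```

Returning infinity for identical frames is a design choice of this sketch; it avoids a division by zero and still compares correctly against finite PSNR values.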
2.1 Motion Estimation
A brief description of the block partitioning and block search algorithms is given below.
2.1.1 Block Partitioning Based on Vector
Quantization
For the block partitioning phase, we start by using
vector quantization to provide the block matching
scheme with the position of partitioning.
Set the number of codewords, or size of the code-
book to 4. This assumes that we need 4 regions to
emerge out of the image frame during the quanti-
zation process.
Initialize the positions of the codewords to $(\frac{w}{4}, \frac{h}{4})$, $(\frac{w}{4}, \frac{3h}{4})$, $(\frac{3w}{4}, \frac{h}{4})$, $(\frac{3w}{4}, \frac{3h}{4})$, where $(w, h)$ are the width and height of the block respectively. By this we assume that the worst case partition could be the quad-tree partition.
Determine the distance of every pixel from the
codewords using a specific criterion. The distance
measure is the sum of differences in the gray in-
tensity and the locations of the pixels.
Group pixels that have the least distance to their
respective codewords.
Iterate the process again by recomputing the codeword as the average of each codeword group (class). If $m$ is the number of vectors in each class then,
$$ CW = \frac{1}{m} \sum_{j=1}^{m} x_j \tag{2} $$
Repeat until either the codewords don't change or the change in the codewords is small.
Associated with these 4 codewords, there are 4
configurations possible for partitioning the image
frame into blocks. The configurations arise if we
assume one square block per configuration. It is
logical thereafter to find the best configuration as
the center of mass of these 4 possible configura-
tions. The center of mass will now be the partition
that splits the image frame into blocks.
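The partitioning steps above can be sketched as a small k-means style loop. This is a hedged illustration under several assumptions that the text does not fix: each pixel is the vector (x, y, gray), the distance is the sum of absolute differences in position and intensity, each initial codeword takes the intensity of the pixel at its starting position, and the split point is taken as the mean of the four final codeword positions. The function name `vq_partition` and its parameters `max_iter` and `tol` are hypothetical.

```python
def vq_partition(img, max_iter=20, tol=0.5):
    """Return four codewords (x, y, gray) and a split point (x, y) for a frame."""
    h, w = len(img), len(img[0])
    # Four codewords seeded at the quadrant centres (quad-tree worst case).
    seeds = [(w / 4, h / 4), (w / 4, 3 * h / 4),
             (3 * w / 4, h / 4), (3 * w / 4, 3 * h / 4)]
    cws = [(x, y, float(img[min(int(y), h - 1)][min(int(x), w - 1)]))
           for x, y in seeds]
    for _ in range(max_iter):
        groups = [[] for _ in cws]
        for y in range(h):
            for x in range(w):
                g = img[y][x]
                # Assumed distance: location difference plus intensity difference.
                d = [abs(x - cx) + abs(y - cy) + abs(g - cg)
                     for cx, cy, cg in cws]
                groups[d.index(min(d))].append((x, y, g))
        new = []
        for cw, grp in zip(cws, groups):
            if grp:
                m = len(grp)
                # Eq. (2): the codeword is the mean of the vectors in its class.
                new.append(tuple(sum(p[k] for p in grp) / m for k in range(3)))
            else:
                new.append(cw)  # keep an empty class where it was
        shift = max(abs(a - b) for n, o in zip(new, cws) for a, b in zip(n, o))
        cws = new
        if shift < tol:  # stop when the codewords barely change
            break
    split = (sum(c[0] for c in cws) / 4, sum(c[1] for c in cws) / 4)
    return cws, split
```

On a uniform frame the codewords stay at the quadrant centres, so the split point falls at the frame centre and the result reduces to a quad-tree partition.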
2.1.2 Genetic Algorithm Search
The inputs to the genetic algorithm are the block $b_t$ and the centroid $(x_c, y_c)$ of the block.
Population Initialization: A population $P$ of $n$ chromosomes representing $(T_x, T_y, \theta)$ is generated from uniformly distributed random numbers, where $1 \le n \le limit$ and $limit$ (100) is the user-defined maximum size of the population.
To evaluate the fitness $E(n)$ for every chromosome $n$:
ICINCO 2007 - International Conference on Informatics in Control, Automation and Robotics
  Extract the pixel locations corresponding to the block from frame $f_t$ using the centroid $(x_c, y_c)$ and block size information.
  Affine transform these pixels using the translation parameters $(T_x, T_y)$ and rotation angle $\theta$ using
$$ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & T_x \\ 0 & 1 & T_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$
  If $b_t$ represents the original block under consideration, $b_{t+1}$ represents the block identified at the destination frame after transformation and $(h, w)$ the dimensions of the block, then the fitness $E$ can be measured as the mean absolute difference (MAD).
$$ MAD = \frac{1}{hw} \sum_{i=1}^{h} \sum_{j=1}^{w} \left| b_t(i, j) - b_{t+1}(i, j) \right| \tag{3} $$
Optimization: Determine the chromosome with minimum error, $n_{emin} = n$ where $E$ is minimum. As this represents a pixel in the block, determine all the neighbors $NH_k$ of the pixel, where $1 \le k \le 8$.
  For all $k$, determine the error of matching as in the fitness evaluation.
  If $E(NH_k) < E(n_{emin})$, then $n_{emin} = NH_k$.
Selection: Define selection probabilities to select chromosomes for mutation or cloning. Perform cross-over and mutation operations by swapping random genes and using uniform random values.
Termination: Three termination criteria are used: zero error, maximum generations and stall generations. Check whether any criterion is satisfied; otherwise iterate until termination.
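The fitness evaluation used by the genetic algorithm can be sketched as follows. This is an illustrative reading of the transform and of Eq. (3), not the authors' code: the rotation sign convention (counter-clockwise), nearest-pixel sampling, integer centroids and the fill value for pixels that land outside the frame are all assumptions, and the names `transform_point`, `mad` and `fitness` are hypothetical.

```python
import math

def transform_point(x, y, tx, ty, theta):
    """Translate by (tx, ty), then rotate by theta about the origin."""
    xs, ys = x + tx, y + ty
    c, s = math.cos(theta), math.sin(theta)
    return c * xs - s * ys, s * xs + c * ys

def mad(block_a, block_b):
    """Mean absolute difference between two h x w blocks, as in Eq. (3)."""
    h, w = len(block_a), len(block_a[0])
    total = sum(abs(block_a[i][j] - block_b[i][j])
                for i in range(h) for j in range(w))
    return total / (h * w)

def fitness(frame_t, frame_t1, centroid, size, chromosome, fill=0):
    """Matching error E for one chromosome (T_x, T_y, theta)."""
    xc, yc = centroid          # integer centroid of the block in frame t
    bh, bw = size              # block height and width
    tx, ty, theta = chromosome
    H, W = len(frame_t1), len(frame_t1[0])
    x0, y0 = xc - bw // 2, yc - bh // 2
    src, dst = [], []
    for i in range(bh):
        src_row, dst_row = [], []
        for j in range(bw):
            x, y = x0 + j, y0 + i
            src_row.append(frame_t[y][x])
            xn, yn = transform_point(x, y, tx, ty, theta)
            xi, yi = int(round(xn)), int(round(yn))
            # Pixels mapped outside the frame use an assumed fill value.
            inside = 0 <= xi < W and 0 <= yi < H
            dst_row.append(frame_t1[yi][xi] if inside else fill)
        src.append(src_row)
        dst.append(dst_row)
    return mad(src, dst)
```

A chromosome with zero fitness corresponds to the exact-match case in which the motion vector is read off directly; any other value would trigger either the neighbourhood refinement or a further block split.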
2.2 Motion Smoothing
The work of (T.Chen, 2000) suggested the use of a
moving average low pass filter for this process. In
this paper, we extend the moving average filter to an
exponentially weighted moving average filter.
2.2.1 Exponentially Weighted Moving Average
Filter
A detailed pseudo code describing the process is as follows.
Set the number of frame pairs across which the moving average filter operates to a scalar $J$.
Compute the parameter $\alpha = 1 / J$.
Compute the weighting factors for every frame pair between 1 and $J$ as $w_i = \alpha^{i-1} \times (1 - \alpha)$, where $1 \le i \le J$. (Use these weighting factors as a kernel for the convolution process.)
Generate vectors of the motion vectors and of the rotation parameter across all frames, $MV$ and $\theta$.
Perform convolution to generate the smoothed motion vectors, $MV' = MV * w$ and $\theta' = \theta * w$.
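The filter steps above can be sketched for a single motion parameter (one component of the motion vector, or the rotation angle). This is a minimal illustration under stated assumptions: the convolution is causal, so the newest sample receives the first weight, and the weights are renormalized near the start of the sequence where fewer than J samples are available. The names `ewma_weights` and `ewma_smooth` are hypothetical.

```python
def ewma_weights(J):
    """Kernel w_i = alpha**(i-1) * (1 - alpha) for i = 1..J, with alpha = 1/J."""
    if J < 2:
        raise ValueError("J must be at least 2, otherwise every weight is zero")
    alpha = 1.0 / J
    return [alpha ** (i - 1) * (1.0 - alpha) for i in range(1, J + 1)]

def ewma_smooth(values, J):
    """Causal convolution of a parameter sequence with the EWMA kernel."""
    w = ewma_weights(J)
    out = []
    for t in range(len(values)):
        acc, norm = 0.0, 0.0
        for i, wi in enumerate(w):
            if t - i < 0:
                break  # fewer than J past samples are available
            acc += wi * values[t - i]
            norm += wi
        # Renormalize so that a constant input stays constant (assumption).
        out.append(acc / norm)
    return out
```

For example, `ewma_smooth([0.0, 10.0, 0.0, 10.0], 4)` pulls the second sample down from 10 towards its predecessor, which is the damping of abrupt frame-to-frame changes that motion smoothing aims for.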
2.2.2 Kalman Filter
A 2D Kalman filter can be used to predict the motion vectors of successive frames given the observations or motion vectors of the previous frames. An algorithm describing the smoothing process is listed below.
Initialize the state of the system using $(x, y, dx, dy)$, where $(x, y)$ is the observation (i.e. the centroid of the block) and $(dx, dy)$ is the displacement of the centroids. The values of the state can be initialized using the motion estimates between the first successive frame pair.
The state of the system $S$ at time instant $t + 1$ and the observation $M$ at time $t$ can be modeled using
$$ S(t + 1) = A S(t) + Noise(Q) \tag{4} $$
$$ M(t) = S(t) + Noise(R) \tag{5} $$
Initialize $A$, and model the noise terms $Q$ and $R$ as Gaussian.
Perform the predict and update steps of the standard Kalman filter:
  Initialize the state at time instant $t_0$ using $S_0 = B^{-1} M_0$, with a diagonal initial error covariance $U_0$.
  Iterate between the predict and update steps.
  Predict: Estimate the state at time instant $t + 1$ using $S_k^{-} = A S_{k-1}$ and measure the predicted error covariance as $U_k^{-} = A U_{k-1} A^T + Q$.
  Update: Update the corrected state of the system, $S_k = S_k^{-} + K (M_k - B S_k^{-})$, and the error covariance as $U_k = (I - KB) U_k^{-}$.
  Compute $K$, the Kalman gain, using $K = U_k^{-} B^T (B U_k^{-} B^T + R)^{-1}$, where $B$ is the observation matrix relating the state to the measurement.
Smooth the estimates of the Kalman filter and present the smoothed outcomes as $MV'$.
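The predict and update steps listed above can be sketched in scalar form, filtering one motion-vector component at a time. This is a deliberate simplification of the 2D filter described in the text: A, B, Q, R and the initial covariance are treated as scalars, and their default values here are assumptions chosen only for illustration. The function name `kalman_smooth` is hypothetical.

```python
def kalman_smooth(measurements, a=1.0, b=1.0, q=1e-3, r=1.0, u0=1.0):
    """Scalar Kalman filter over a sequence of measurements M_0, M_1, ..."""
    if not measurements:
        return []
    s = measurements[0] / b  # S_0 = B^{-1} M_0
    u = u0                   # assumed initial error covariance
    out = [s]
    for m in measurements[1:]:
        # Predict: propagate the state and the error covariance.
        s_pred = a * s
        u_pred = a * u * a + q
        # Kalman gain K = U^- B^T (B U^- B^T + R)^{-1}, scalar case.
        k = u_pred * b / (b * u_pred * b + r)
        # Update: correct the prediction with the new measurement.
        s = s_pred + k * (m - b * s_pred)
        u = (1.0 - k * b) * u_pred
        out.append(s)
    return out
```

A larger `r` relative to `q` makes the filter trust the motion model more than the measurements and therefore smooths more strongly. The matrix form follows the same sequence of steps, with matrix products and an inverse in place of the scalar operations.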
3 RESULTS AND DISCUSSION
In this section, we present some sample results of the stabilization task on
wildlife videos taken at a zoological park. The performance of the video
stabilization scheme can only be evaluated visually, so we provide some sample
frames illustrating the quality of video stabilization. Figure 1 compares the
video stabilization quality of the baseline model with that of the proposed
model. The stabilized output of the proposed model is visibly better than that
of the baseline model. The motion correction scheme using the Kalman filter was
sufficient to smooth the motion vectors correctly, because the changes observed
in the capture were linear. Similarly, in Figure 2 we compare the quality of
video stabilization using another sample clip from the same wildlife video. The
movement of the camera in this sequence was more abrupt and random in direction.
We observed that the proposed model using the Kalman filter could not handle
these changes well and did not generate a good quality stabilized output.
However, the motion correction mechanism using the exponentially weighted moving
average filter produced much better results.
Figure 1: Model Performances on Video Sample Clip 3. Panels: Baseline Model (Unstabilized Frame, Stabilized Frame); Proposed Model (Unstabilized Frame, Stabilized Frame).
4 CONCLUSION
In this paper, we have presented a novel motion correction mechanism together
with a block based motion estimation strategy that combines vector quantization
based block partitioning with a genetic algorithm based block search, applied to
video stabilization. The model was tested on several real time datasets, and the
results revealed a high degree of performance improvement compared with an
existing video stabilization model based on motion estimation and filtering.
Figure 2: Model Performance on Video Sample Clip 6. Panels: Baseline Model (Unstabilized Frame, Stabilized Frame); Proposed Model (Unstabilized Frame, Stabilized Frame).
REFERENCES
A.Litvin, J. and W.C.Karl (2003). Probabilistic video stabilization using Kalman filtering and mosaicking. Image and Video Communication and Processing.
J.Tucker and Lazaro, A. S. (1993). Image stabilization for a camera on a moving platform. In IEEE Conference on Communications, Computers and Signal Processing, pages 734–737.
K.Ratakonda (1998). Real-time digital video stabilization for multi-media applications. In Proceedings of the 1998 IEEE International Symposium on Circuits and Systems, pages 69–72.
K.Uomori, A.Morimura, H. T. and Y.Kitamura (1990). Automatic image stabilizing system by full-digital signal processing. In IEEE Transactions on Consumer Electronics, pages 510–519.
M.Hansen, P. and K.Dana (1994). Real-time scene stabilization and mosaic construction. Image Understanding Workshop Proceedings.
P.Pochec (1995). Moire based stereo matching technique. In Proceedings of ICIP, pages 370–373.
T.Chen (2000). Video stabilization algorithm using a block based parametric motion model. Master's thesis.
Y.Matsushita, E.Ofek, X. and H.Y.Shum (2005). Full-frame video stabilization. In IEEE International Conference on Computer Vision and Pattern Recognition.
Y.Yao, P. and R.Chellappa (1995). Electronic image stabilization using multiple image cues. In Proceedings of ICIP, pages 191–194.