mismatch between the transmitting side and the
receiving side, introduced by the errors of unreliable
network transmission. How to solve the mismatch
problem is the primary task for multiple description
video coding. To eliminate the mismatch, motion
estimation and compensation were individually
performed in each description (Tillier, 2007), which
lowered the coding efficiency and increased the
computational complexity.
With the emergence of the H.264 coding standard,
several H.264-based multiple description video
coding methods were proposed to improve the error
resilience of H.264. Bernardini and Durigon
proposed a polyphase spatial subsampling multiple
description coding (PSS-MDC) (Bernardini, 2004),
which separated the source video into four
subsequences by spatial subsampling, and then
coded each subsequence by an H.264 encoder to
form four bitstreams, which were transmitted
through diverse channels. Even if three of them were
lost, the method was able to ensure an acceptable
reconstructed quality of the input video. Based on
PSS-MDC, Wei et al. paired subsequences to form
two descriptions and proposed a neighboring pixel
prediction algorithm to reduce the redundancy of
PSS-MDC (Wei, 2006). Campana et al. modified
the MDSQ by mapping more zero coefficients to
improve the coding efficiency (Campana, 2008). A
slice optimal allocation for H.264 multiple
description video coding was proposed by Tillo
(Tillo, 2008). However, it was not suitable for low
bitrate transmission.
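The polyphase spatial subsampling that PSS-MDC relies on can be sketched for a single frame as follows. This is a minimal illustration, not the implementation of Bernardini and Durigon: it assumes a 2x2 phase pattern (even/odd rows crossed with even/odd columns), and the function names `polyphase_split` and `polyphase_merge` are hypothetical. In PSS-MDC each of the four subsequences would then be coded by its own H.264 encoder, which is omitted here.

```python
def polyphase_split(frame):
    """Split a 2D frame (list of rows) into four polyphase subframes.

    Assumption: phase (r, c) keeps the pixels whose row index is r mod 2
    and whose column index is c mod 2.
    """
    return {
        (r, c): [row[c::2] for row in frame[r::2]]
        for r in (0, 1)
        for c in (0, 1)
    }


def polyphase_merge(subframes, height, width):
    """Re-interleave the four subframes into a full-resolution frame."""
    frame = [[None] * width for _ in range(height)]
    for (r, c), sub in subframes.items():
        for i, row in enumerate(sub):
            for j, pixel in enumerate(row):
                frame[2 * i + r][2 * j + c] = pixel
    return frame


# Toy 4x4 frame whose pixel value encodes its position (10 * row + column).
frame = [[10 * y + x for x in range(4)] for y in range(4)]
subs = polyphase_split(frame)
restored = polyphase_merge(subs, 4, 4)
```

When all four subframes arrive, re-interleaving restores the frame exactly. If some are lost, a decoder would have to interpolate the missing phases from the received ones, which is why the reconstruction degrades gracefully rather than failing.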
Motivated by the layered MDC scheme, we propose
a multiple description video coding method based
on H.264 and the dual-tree discrete wavelet
transform. The base layer is produced by feeding
the input into an H.264 encoder at a low bitrate
and is copied into each description. The
enhancement layer is formed from the four trees of
wavelet transform coefficients output by the
dual-tree discrete wavelet transform. These
coefficient trees are paired into two sets, and
each set is sent to a separate description. Each
description thus comprises a base layer and an
enhancement layer and is transmitted through a
different channel. The simulation results show the
efficiency and the error resilience of the
proposed method.
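The way the two descriptions are assembled can be sketched as a small data structure. This is only an illustration under stated assumptions: the text does not say which coefficient trees are paired, so pairing trees 0 and 1 against trees 2 and 3 is an arbitrary choice, and the names `build_descriptions` and `collect` are hypothetical. The base layer and the trees are treated as opaque payloads.

```python
def build_descriptions(base_layer, trees):
    """Form two descriptions from one base layer and four coefficient trees."""
    if len(trees) != 4:
        raise ValueError("expected four coefficient trees")
    # Assumed pairing (not specified in the text): (0, 1) and (2, 3).
    pairs = [(trees[0], trees[1]), (trees[2], trees[3])]
    # The same base layer is copied into every description.
    return [{"base": base_layer, "enhancement": pair} for pair in pairs]


def collect(received):
    """Gather what a decoder could use from the descriptions that arrived."""
    if not received:
        return None
    base = received[0]["base"]  # identical in every description
    trees = [tree for desc in received for tree in desc["enhancement"]]
    return base, trees


descriptions = build_descriptions(b"h264-base", ["t0", "t1", "t2", "t3"])
only_second = collect(descriptions[1:])  # first description lost
both = collect(descriptions)             # nothing lost
```

With one description lost, the decoder still holds the base layer plus two of the four trees; with both received, it holds all four.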
The rest of this paper is organized as follows.
H.264 and the dual-tree discrete wavelet transform
are introduced in Section 2. Section 3 concentrates
on how to generate the descriptions of the proposed
multiple description video coding scheme.
Simulation results and analysis are presented in
Section 4. Concluding remarks are given in
Section 5.
2 DUAL-TREE DISCRETE
WAVELET TRANSFORM
2.1 Dual-tree Discrete Wavelet
Transform (DT-DWT)
To improve the directional selectivity and shift
invariance of the traditional discrete wavelet
transform (DWT), Nick Kingsbury proposed the
dual-tree complex wavelet transform (DT-CWT) in
1998. Its directional subband decomposition makes
the DT-CWT nearly shift-invariant and gives it
higher directional selectivity. However, the
DT-CWT is an over-complete transform with
considerable redundancy ($2^n$:1 for an
$n$-dimensional signal). By analyzing the real and
imaginary parts of the wavelet coefficients,
Selesnick found that the two parts have the same
directional selectivity and that either one could
serve as a wavelet transform on its own, halving
the number of coefficients. In this light,
Selesnick (Selesnick, 2005) proposed the dual-tree
discrete wavelet transform (DT-DWT). Figure 1
shows the six high-frequency directional subbands
of the 2D DT-DWT.
Figure 1: The directional subbands of 2D DT-DWT.
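The redundancy figures above can be checked with a short calculation. The sketch below only restates the arithmetic in the text (a redundancy of 2^n:1 for the DT-CWT, halved when one of the real or imaginary parts is kept); the function names are hypothetical.

```python
def dt_cwt_redundancy(n):
    """Redundancy of the DT-CWT for an n-dimensional signal (2^n : 1)."""
    return 2 ** n


def dt_dwt_redundancy(n):
    """Redundancy after keeping only one of the real/imaginary parts."""
    return dt_cwt_redundancy(n) // 2


for n, label in ((2, "image"), (3, "video")):
    print(f"{label}: DT-CWT {dt_cwt_redundancy(n)}:1, "
          f"DT-DWT {dt_dwt_redundancy(n)}:1")
```

For video (n = 3) this gives 4:1 for the DT-DWT, which agrees with the 3D transform being carried out as four 3D discrete wavelet transforms.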
2.2 Implementation of 3D Dual-tree
Discrete Wavelet Transform
The 3D dual-tree discrete wavelet transform can be
implemented with four separate 3D discrete wavelet
transforms (Wang, 2007). Figure 2 shows one of
these transforms, built from convolution operations
and downsampling by 2 ($\downarrow 2$); $x$, $y$
and $z$ represent the three axes (horizontal,
vertical and time, respectively); $h_0$ and $h_1$
are the low-pass and high-pass filters, which form
a Hilbert transform pair and ensure the perfect
reconstruction of the discrete wavelet transform.
For convenience, $h_0 x$ is used to denote the
convolution