for the whole transmission. In this case, the encoder
must allocate the bitrate among the coding units
(e.g. frames) and control the encoder parameters, i.e.
adjust the quantization parameter, in order to spend
the allocated bits efficiently.
In this context, this paper presents an encoder
rate control technique which achieves a constant
bitrate while minimizing changes in the quality of
the decoded sequence.
2 VISNET2 WZ VIDEO CODEC
The overall Wyner-Ziv (WZ) coding architecture for
the VISNET2 video codec is illustrated in Figure 1.
This codec follows the architecture proposed in
(Brites et al., 2006), except for the encoder rate
control module, which is proposed in this paper.
The coding process starts by dividing the
video frames into key frames and Wyner-Ziv (WZ)
frames. Then, one or two key frames are encoded
using the H.264/AVC Intra mode (Wiegand et al.,
2003) in order to guarantee that each GOP is
delimited by key frames. The quality, and thus the
rate, of each key frame is defined mainly by the
quantization parameter (QP).
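The frame splitting described above can be illustrated with a minimal sketch. The function name `split_frames` and the rule used here are assumptions made for illustration only: a frame is treated as a key frame when its index is a multiple of the GOP size, and the last frame is forced to be a key frame so that the final GOP is also delimited by key frames.

```python
def split_frames(num_frames, gop_size=2):
    """Return (key_indices, wz_indices) for a sequence of num_frames frames.

    Assumption (not taken from the paper): frame i is a key frame when
    i is a multiple of gop_size, and the last frame is forced to be a
    key frame so that every GOP is delimited by key frames.
    """
    if num_frames <= 0 or gop_size < 1:
        raise ValueError("num_frames must be > 0 and gop_size must be >= 1")
    key = [i for i in range(num_frames) if i % gop_size == 0]
    if key[-1] != num_frames - 1:
        key.append(num_frames - 1)  # close the last GOP with a key frame
    key_set = set(key)
    wz = [i for i in range(num_frames) if i not in key_set]
    return key, wz
```

With `gop_size=2`, every WZ frame has a key frame on each side, which corresponds to the case where one additional key frame per GOP is enough once the first GOP has been opened.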
The frames in between are WZ frames, which are
simply coded with an H.264/AVC 4×4 block-based
discrete cosine transform (DCT) followed by the
aggregation of the DCT coefficients into 16 frequency
bands $b_k$. Each band is uniformly quantized and
bitplanes are created and sent to the turbo encoder.
The encoder establishes the final decoded quality by
defining, for each band $b_k$, the respective number of
bitplanes $M_k$ for which WZ bits are generated, i.e.
the number of bitplanes that will have a small error
probability after turbo decoding. There are eight 4×4
quantization matrices (Brites et al., 2006), which
define different $M_k$ values for each DCT band $b_k$,
making it possible to achieve different rate-distortion
(RD) performances. The quantization matrix used by
both the encoder and the decoder is defined by the
$Q_{Index}$ parameter.
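The quantization and bitplane extraction of one DCT band can be sketched as follows. This is only an illustration under stated assumptions: the coefficients are taken to be non-negative and to lie in a known range [0, v_max), and the helper names `quantize_band` and `bitplanes` are hypothetical. The actual codec may treat the AC bands (which are signed) differently.

```python
def quantize_band(coeffs, m_k, v_max):
    """Uniformly quantize coefficients of one band to 2**m_k levels.

    Assumption: coefficients are non-negative and lie in [0, v_max).
    m_k == 0 means that no WZ bits are generated for this band.
    """
    if m_k == 0:
        return []
    levels = 1 << m_k
    step = v_max / levels
    # Clamp to the last level in case a value touches the upper bound.
    return [min(int(c / step), levels - 1) for c in coeffs]


def bitplanes(symbols, m_k):
    """Split quantized symbols into m_k bitplanes, most significant first."""
    return [[(s >> bit) & 1 for s in symbols]
            for bit in range(m_k - 1, -1, -1)]
```

For example, with $M_k = 3$ and a range of 1024, the quantization step is 128, so the coefficients 0, 100, 511 and 1023 map to the symbols 0, 0, 3 and 7, and three bitplanes are produced for the turbo encoder.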
At the decoder, for each WZ frame, the side
information $Y_i$, an estimate of the $X_i$ frame, is
created by motion compensated interpolation (MCI)
based on two references, one temporally in the past
and another in the future (for GOP = 2 the references
correspond to the key frames). Then, the DCT
is applied to the side information and,
with a Laplacian correlation model, soft-input
information is obtained for the turbo decoder. The
iterative turbo decoder uses the received parity bits
and the soft-input side information and attempts to
generate the decoded quantized symbol stream with a
small error probability $P_e$. If the decoding is not
successful ($P_e > 10^{-3}$), the decoder requests
more parity bits via the feedback channel until
successful decoding ($P_e < 10^{-3}$) is achieved. Each
bitplane of each band is turbo decoded, starting from
the most significant bitplane and the DC coefficient
band. A zig-zag scan order is followed for the DCT
bands. After turbo decoding all bitplanes of all DCT
bands for which WZ bits were sent, the quantized
symbol stream is obtained. Next, in the
reconstruction module, the side information is used
together with the decoded quantized symbol stream
to obtain the decoded $X_i$ frame after the IDCT.
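The decoding order and the feedback-channel request loop described above can be sketched as follows. The names `decoding_order`, `decode_bitplane` and the callable `try_decode` are hypothetical and stand in for the real turbo decoder; the cap `max_requests` is also an assumption added so that the loop always terminates.

```python
def decoding_order(m):
    """Yield (band, bitplane) pairs in decoding order.

    m[k] is the number of bitplanes M_k of band k, with bands already
    indexed in zig-zag order (k = 0 is the DC band). Bitplane index 0
    denotes the most significant bitplane.
    """
    for k, m_k in enumerate(m):
        for b in range(m_k):
            yield k, b


def decode_bitplane(try_decode, pe_threshold=1e-3, max_requests=100):
    """Request parity bits until the error probability is small enough.

    try_decode(n) is a hypothetical callable that runs the turbo decoder
    with n chunks of parity bits and returns (bits, p_e).
    """
    for n_chunks in range(1, max_requests + 1):
        bits, p_e = try_decode(n_chunks)
        if p_e < pe_threshold:
            return bits, n_chunks  # successful decoding
    raise RuntimeError("decoding did not converge within max_requests")
```

The number of chunks returned by `decode_bitplane` is what determines the WZ rate actually spent: the better the side information, the fewer requests are needed.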
Finally, the key frames and WZ coded frames are
mixed again to generate the decoded video sequence
with a quality defined by the QP (for key frames)
and $Q_{Index}$ (for WZ frames) encoding quantization
parameters. The bitrate spent depends on the side
information quality, i.e. the accuracy of the MCI
estimation.
A novel encoder rate control module is proposed
in this paper (see Figure 2). It takes as input
the bits spent on the WZ and key frames and
allocates the available bitrate among the WZ and
Intra key frames by changing the QP (for key
frames) and $Q_{Index}$ (for WZ frames) according to the
rate control algorithm proposed in the next section.
3 PROPOSED RC ALGORITHM
Figure 2 presents the flowchart of the proposed rate
control algorithm. For each GOP of the
sequence, the WZ encoder is run with some initial
value of $Q_{Index}$ and generates the parity bits. Next,
the key frames are encoded with a new QP value, which
is selected based on the bitrate of the previous GOP
and the predicted bitrate of the current and next
GOPs. The WZ decoder, invoked in the next step,
uses these key frames and the parity bits produced
by the WZ encoder. In the last step, a new value of
the $Q_{Index}$ parameter is selected according to the QP
value in order to obtain similar quality for the WZ
and Intra frames. The procedure is repeated up to the
last GOP.
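The per-GOP control flow described above can be summarised in a short sketch. All five callables passed to `run_rate_control` are hypothetical placeholders for the codec modules and for the selection procedures; their names and signatures are assumptions made for illustration, not the interfaces of the VISNET2 codec.

```python
def run_rate_control(gops, q_index_init, wz_encode, select_qp,
                     encode_key_frames, wz_decode, select_q_index):
    """Run the four steps of the flowchart once per GOP.

    Hypothetical callables (assumed signatures):
      wz_encode(gop, q_index)        -> parity bits
      select_qp(prev_bits, gop)      -> QP for the key frames
      encode_key_frames(gop, qp)     -> (key_frames, key_bits)
      wz_decode(key_frames, parity)  -> (decoded_frames, wz_bits)
      select_q_index(qp)             -> Q_Index for the next GOP
    """
    q_index = q_index_init
    prev_bits = None  # no previous GOP before the first one
    log = []
    for gop in gops:
        parity = wz_encode(gop, q_index)                  # step 1
        qp = select_qp(prev_bits, gop)                    # step 2
        key_frames, key_bits = encode_key_frames(gop, qp)
        _, wz_bits = wz_decode(key_frames, parity)        # step 3
        prev_bits = key_bits + wz_bits
        log.append((q_index, qp, prev_bits))
        q_index = select_q_index(qp)                      # step 4
    return log
```

Passing the modules in as arguments keeps the sketch independent of any particular codec implementation; the returned log records, for each GOP, the $Q_{Index}$ and QP used and the bits spent.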
In the next subsections, a detailed description of QP
and $Q_{Index}$ selection procedures is given.
SIGMAP 2008 - International Conference on Signal Processing and Multimedia Applications