in the enhancement layer the ROI is the only useful
image area. Therefore, spatial and quality scalability are
achieved only for the ROI, which should contain the
image area of interest for target applications. In the
following sections, the Rate-Distortion and Complexity
performance of two methods, compliant with
H.264/SVC, is evaluated and compared with
straightforward encoding without ROI.
2 H.264/SVC ROI WITH SPATIAL SCALABILITY
The underlying idea to achieve efficient encoding of
the ROI in the higher resolution layer is to minimise
the number of bits spent in the background region of
the higher resolution images. In the base layer there is
no distinction between ROI and background. One of
the methods proposed in this work is based on coarse
quantisation of the background region and finer
quantisation of the ROI in the high resolution layer. In
this method, the macroblocks (MBs) of the background
region, i.e., outside the ROI, are encoded with the
maximum quantisation scale allowed by H.264/SVC
(Qp=51) in order to maximise the number of null
coefficients. The other method is based on setting to
zero the transform coefficients of the MBs outside the
ROI regardless of their value. Note that in this case
quantisation is avoided for these MBs. In both
methods, the ROI is defined by a mask, providing a
ROI map (ROImap) which is used by the encoder to
identify the ROI MBs, though it is not encoded into the
video stream.
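As an illustration of how such a ROI map could be derived from a rectangular mask, the sketch below marks every macroblock that overlaps a given rectangle. The function name `build_roi_map`, the rectangle format and the overlap rule are assumptions made for this example; the text does not specify how the mask is converted into the ROImap.

```python
MB_SIZE = 16  # H.264 macroblock size in luma samples


def build_roi_map(width, height, roi_rect):
    """Return a 2-D list of booleans, one per macroblock (True = ROI).

    roi_rect is (x, y, w, h) in luma samples. An MB is marked as ROI
    when it overlaps the rectangle at all (an assumption of this sketch).
    """
    mbs_x = (width + MB_SIZE - 1) // MB_SIZE
    mbs_y = (height + MB_SIZE - 1) // MB_SIZE
    x0, y0, w, h = roi_rect
    x1, y1 = x0 + w, y0 + h
    roi_map = []
    for my in range(mbs_y):
        row = []
        for mx in range(mbs_x):
            px, py = mx * MB_SIZE, my * MB_SIZE
            # Standard rectangle-overlap test between the MB and the ROI.
            overlaps = (px < x1 and px + MB_SIZE > x0 and
                        py < y1 and py + MB_SIZE > y0)
            row.append(overlaps)
        roi_map.append(row)
    return roi_map
```

The map exists only on the encoder side, which is consistent with the ROImap not being written into the video stream.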
2.1 QP51 Outside ROI
The functional implementation of this method is
depicted in Figure 1. For each MB of the high resolution
layer, the QP value is switched between 51, for MBs
located outside the ROI, and the QP value selected for
the current MB, for MBs within the ROI. The ROI is not
defined in the base layer, thus the whole image is
encoded normally at the lower resolution.
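The QP switch described above can be sketched as follows. This is a minimal illustration, not the reference software implementation: the names `select_mb_qp` and `qp_map` are hypothetical, and a single `selected_qp` is assumed for all ROI MBs, whereas an encoder may select a different QP per MB.

```python
QP_MAX = 51  # largest quantisation parameter allowed by H.264/SVC


def select_mb_qp(in_roi, selected_qp):
    """Return the QP to use for one enhancement-layer macroblock."""
    if not 0 <= selected_qp <= QP_MAX:
        raise ValueError("QP must lie in the range 0..51")
    # ROI MBs keep the selected QP; background MBs are forced to 51.
    return selected_qp if in_roi else QP_MAX


def qp_map(roi_map, selected_qp):
    """Apply the switch to every MB of a 2-D ROI map (True = ROI)."""
    return [[select_mb_qp(in_roi, selected_qp) for in_roi in row]
            for row in roi_map]
```

For example, `qp_map([[False, True], [False, False]], 30)` gives `[[51, 30], [51, 51]]`.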
Therefore, the quality of ROI MBs is much higher
than that of the MBs outside the ROI and consequently
most of the bits used in the high resolution layer are
assigned to the ROI. Note that in the high resolution
layer the only useful information that needs to be
encoded is the ROI itself, because the lower quality and
resolution of the background region provided by the
base layer should be enough for the envisaged
application.
Figure 1: Qp51 functional diagram.
2.2 Set-to-Zero
The objective of this method is the same as that of the
previous one: to spend no bits in the MBs outside the
ROI and to increase the subjective quality of the ROI in
the higher resolution layer. In the Set-to-Zero method,
the transform coefficients of residual blocks are set to
zero for those MBs outside the ROI. Thus, the encoder
sets the syntax element coded block pattern (CBP) to 0
for these MBs. Figure 2 shows the functional diagram of
the Set-to-Zero method.
Figure 2: Set-to-Zero diagram.
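A minimal sketch of the Set-to-Zero step is given below. The function name `set_to_zero` and the representation of an MB as a list of coefficient lists are assumptions made for illustration; the CBP is treated as an opaque integer that is passed through unchanged for ROI MBs.

```python
def set_to_zero(mb_blocks, cbp, in_roi):
    """Zero the residual of a macroblock located outside the ROI.

    mb_blocks: list of coefficient lists, one per transform block of the MB.
    cbp: coded block pattern the encoder would otherwise signal.
    Returns (blocks, cbp); ROI MBs pass through unchanged.
    """
    if in_roi:
        return mb_blocks, cbp
    # Discard every coefficient regardless of its value, so no
    # quantisation is needed, and signal that no block is coded.
    zeroed = [[0] * len(block) for block in mb_blocks]
    return zeroed, 0
```

Unlike the QP switch, this step guarantees an empty residual for background MBs, since the coefficients are discarded rather than coarsely quantised.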
3 SIMULATION RESULTS
The performance of the two methods described in the
previous section was evaluated with regard to
rate-distortion and encoding complexity. Separate
experiments were carried out for Intra and Inter coding
modes. The proposed methods were implemented using
the JVT reference software, version 8.9, as the base
framework. The test sequence “Mobile” was used in
the experiments with two layers, QCIF@30fps (base
layer) and CIF@30fps (enhancement layer), and two
ROIs (ROI1, ROI2) of different sizes. ROI1 is a
192x144 pel image region covering the area of the
calendar numbers and ROI2 is the whole calendar, as
shown in Figure 3.
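To give a sense of scale, the share of the enhancement layer covered by ROI1 can be worked out in macroblocks. The short calculation below assumes that ROI1 is aligned with the 16x16 MB grid, which the text does not state; the helper `mb_count` is introduced only for this example.

```python
MB_SIZE = 16  # macroblock size in luma samples


def mb_count(width, height):
    """Number of macroblocks needed to cover a width x height region."""
    mbs_x = (width + MB_SIZE - 1) // MB_SIZE
    mbs_y = (height + MB_SIZE - 1) // MB_SIZE
    return mbs_x * mbs_y


roi1_mbs = mb_count(192, 144)     # 12 x 9 = 108 MBs
cif_mbs = mb_count(352, 288)      # 22 x 18 = 396 MBs
roi1_share = roi1_mbs / cif_mbs   # roughly 0.27 of the CIF frame
```

Under that assumption, ROI1 covers 108 of the 396 MBs of a CIF frame, so roughly three quarters of the enhancement-layer MBs belong to the background.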
In the experiments, the following settings were used
for the Intra test: two spatial layers (QCIF and CIF) at
30fps; NumberReferenceFrames 1; FastSearch; Loop
Filter on. The coding parameters were as follows: for
the base layer: CABAC; Basic QP 35; FRExt no; for
layer 1: CABAC; InterLayerPred on; FRExt on. For the
Inter tests, the following settings were used: two
spatial layers (QCIF and CIF); 30 frames;
NumberReferenceFrames 1; FastSearch; Loop Filter on;
MaxDelay 1200; GOPsize
H.264/SVC ROI ENCODING WITH SPATIAL SCALABILITY