the decoder copies the luma samples from the reference frame. $SAE_{noSKIP}$ is the SAE of the decoded MB (compared with the original, uncoded MB) and is an approximate measure of the distortion incurred if the MB is not skipped. $SAE_{SKIP}$ is calculated as the first step of a Motion Estimation algorithm in the encoder and is readily available at an early stage of processing of each MB. $SAE_{noSKIP}$ is not normally calculated during coding or decoding, and cannot be calculated if the MB is actually skipped. Therefore, a model for $SAE_{noSKIP}$ is used to estimate $SAE_{diff}$. More specifically, given a MB at position i in frame n, the value of $SAE_{noSKIP}$ is set equal to the SAE of the most recent available decoded MB in position i. Experimental results have shown that this is a good predictor for $SAE_{noSKIP}$; the encoder has to compute and store this value for each coded MB. Older values of $SAE_{noSKIP}$ for position i are replaced when a new MB is coded.
The MB skipping model compares $SAE_{diff}$ to a threshold: when $SAE_{diff}$ is lower than the threshold, the current MB is skipped; otherwise it is encoded. The threshold controls the proportion of skipped MBs: a higher threshold results in a larger number of skipped MBs, but also in increased distortion, due to incorrectly skipped MBs.
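The skipping decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by the original authors: the class name `SkipModel` and its methods are hypothetical, $SAE_{diff}$ is assumed here to be the difference $SAE_{SKIP} - SAE_{noSKIP}$ (its definition is not restated in this passage), and a MB with no stored history is assumed to be always coded.

```python
class SkipModel:
    """Sketch of a threshold-based MB skip decision (illustrative names)."""

    def __init__(self, threshold):
        self.threshold = threshold
        # Most recent SAE of a coded (not skipped) MB, stored per MB position.
        self.last_sae_noskip = {}

    def should_skip(self, position, sae_skip):
        """Return True if the MB at `position` should be skipped."""
        predicted = self.last_sae_noskip.get(position)
        if predicted is None:
            # Assumption: with no history for this position, code the MB.
            return False
        # Assumption: SAE_diff = SAE_SKIP - (predicted) SAE_noSKIP.
        sae_diff = sae_skip - predicted
        return sae_diff < self.threshold

    def record_coded(self, position, sae_decoded):
        """Store the SAE of a newly coded MB, replacing any older value."""
        self.last_sae_noskip[position] = sae_decoded


model = SkipModel(threshold=50)
model.record_coded(position=0, sae_decoded=120)
print(model.should_skip(position=0, sae_skip=150))  # difference of 30 is below 50
```

Raising `threshold` in this sketch makes `should_skip` return True more often, which mirrors the trade-off between the number of skipped MBs and distortion noted in the text.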
3.2 Active Macroblock Skipping based on Flexible Macroblock Order
The MB skipping strategy described in (Beesley, 2005) exploits one of the error resilience tools available in H.264/AVC, Flexible Macroblock Order (FMO), which allows a picture to be partitioned into one or more slices so that every MB in a missing slice is likely to have all of its neighbours, belonging to correctly decoded slices, available for concealment at the decoder. According to the proposed method, MBs selected for skipping are deliberately removed during the encoding stage, in the knowledge that they can be effectively concealed during the decoding phase. This provides a significant reduction in bit stream size, with an effect on quality that depends on the concealment options activated at the decoder.
The proposed algorithm computes the PSNR of each decoded MB, along with the PSNR that the MB would have if it were concealed using weighted pixel value averaging, one of the several concealment strategies available. The two PSNR values are then compared: if the concealed MB has a higher PSNR than its decoded equivalent, the MB is marked for possible removal from the encoded bit stream. A marked MB is actually removed only if none of its neighbouring MBs has been removed, because the neighbours are needed to perform concealment at the decoder. The removable MBs are selected from a list of all candidates, ordered by their bit stream size, so that the MBs with the largest potential savings are given the highest priority.
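The selection procedure above can be illustrated with a short greedy sketch. It is only an approximation under stated assumptions: the function name `select_removable` and the input layout are hypothetical, the PSNR and bit-count values are taken as given, and "neighbouring MBs" is assumed to mean the eight surrounding positions, which the text does not specify.

```python
def select_removable(mbs):
    """Greedy choice of MBs to remove from the bit stream.

    mbs maps an MB position (x, y) to a tuple
    (psnr_decoded, psnr_concealed, bits).
    """
    # Candidates: MBs whose concealed version has a higher PSNR.
    candidates = [pos for pos, (dec, con, _bits) in mbs.items() if con > dec]
    # Largest potential bit savings first.
    candidates.sort(key=lambda pos: mbs[pos][2], reverse=True)

    removed = set()
    for x, y in candidates:
        # Assumption: the neighbourhood is the 8 surrounding MB positions.
        neighbours = [(x + dx, y + dy)
                      for dx in (-1, 0, 1)
                      for dy in (-1, 0, 1)
                      if (dx, dy) != (0, 0)]
        # Remove the MB only if every neighbour is still present.
        if not any(n in removed for n in neighbours):
            removed.add((x, y))
    return removed


row = {
    (0, 0): (30.0, 32.0, 100),
    (1, 0): (30.0, 33.0, 200),
    (2, 0): (30.0, 31.0, 150),
}
print(select_removable(row))  # only the costliest of the three adjacent MBs is removed
```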
3.3 A Fast Algorithm for Inter-mode Selection
In (Yu, 2004), the authors present an improved version of the so-called Modified Fast Inter-mode selection (MFInterms) algorithm, to provide a more efficient prediction of the Mode Decision. The strategy includes temporal similarity detection and the detection of different moving features within a MB.
The basic idea exploited by the suggested algorithm is that a mode with a smaller partition size may benefit detailed areas, whereas a larger partition size is more suitable for homogeneous areas. Generally, a homogeneous MB is likely to require the examination of fewer inter-modes than a highly detailed MB. Two measurements are included in the algorithm, targeted at MBs encoded with SKIP mode and at MBs encoded by the inter-modes with larger partition sizes (greater than 8 × 8 pixels): the temporal similarity between two MBs, and the motion consistency of a MB. Macroblocks coded with the SKIP mode can be easily detected by comparing the residue between the current MB and the previously encoded MB with a threshold, as follows:
$$S_{residue} = \sum_{m} \sum_{n} \left| B_{m,n,t} - B_{m,n,t-1} \right| \qquad (5)$$
$$T(S_{residue}) = \begin{cases} 1, & S_{residue} < Th_{ASV} \\ 0, & S_{residue} > Th_{ASV} \end{cases} \qquad (6)$$
where $S_{residue}$ is the sum of absolute differences between $B_{m,n,t}$ and $B_{m,n,t-1}$, which represent the current and previous MBs, respectively. The temporal similarity is implemented by means of an adaptive spatially varying threshold, $Th_{ASV}$, which depends on a constant and on the sum of absolute differences of the four nearest encoded neighbours. This is motivated by the fact that skipped MBs generally tend to occur in clusters, as in a part of a static background; a MB undergoes temporal similarity detection if one of the encoded neighbours is a skipped MB.
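As an illustration of Equations (5) and (6), the following sketch computes the residue between two blocks and applies the threshold test. The function names are hypothetical, and the threshold $Th_{ASV}$ is passed in as a plain number because its exact dependence on the constant and on the neighbours' sums of absolute differences is not given in this passage. Returning 0 when the residue equals the threshold is also an assumption, since Equation (6) leaves that case open.

```python
def s_residue(current, previous):
    """Eq. (5): sum of absolute differences between two equally sized blocks.

    Each block is a list of rows of sample values.
    """
    return sum(abs(c - p)
               for row_c, row_p in zip(current, previous)
               for c, p in zip(row_c, row_p))


def temporal_similarity(current, previous, th_asv):
    """Eq. (6): 1 if the residue is below the threshold, otherwise 0."""
    # Assumption: equality with the threshold is treated as "not similar".
    return 1 if s_residue(current, previous) < th_asv else 0


def is_skip_candidate(current, previous, th_asv, neighbour_skipped):
    """Run the similarity test only if some encoded neighbour was skipped."""
    if not any(neighbour_skipped):
        return False
    return temporal_similarity(current, previous, th_asv) == 1


cur = [[10, 12], [8, 9]]
prev = [[10, 10], [9, 9]]
print(s_residue(cur, prev))  # 0 + 2 + 1 + 0 = 3
```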
Besides the temporal similarity computation, the proposed algorithm suggests checking the motion vector of each 8 × 8 block decomposed from a highly detailed MB. If the motion vectors are consistent, only the inter modes with partition size greater than 8 × 8 are checked; otherwise, all possible inter modes are searched.
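The motion consistency check can be sketched as below. This is a hedged illustration rather than the published procedure: the function names are hypothetical, and "consistency" is assumed to mean that the four 8 × 8 motion vectors agree with the first one within a `tolerance` (in the same units as the vectors), a criterion that the text does not define. The mode labels are the standard H.264 inter partition sizes.

```python
# Inter modes with partition size greater than 8x8.
LARGE_MODES = ["16x16", "16x8", "8x16"]
# All H.264 inter partition sizes.
ALL_MODES = LARGE_MODES + ["8x8", "8x4", "4x8", "4x4"]


def motion_consistent(mvs, tolerance=0):
    """Check whether the 8x8 block motion vectors agree.

    mvs is a sequence of (dx, dy) vectors, one per 8x8 block.
    Assumption: vectors are consistent if each component differs from
    the first vector by at most `tolerance`.
    """
    ref_x, ref_y = mvs[0]
    return all(abs(dx - ref_x) <= tolerance and abs(dy - ref_y) <= tolerance
               for dx, dy in mvs)


def candidate_modes(mvs, tolerance=0):
    """Return the inter modes to examine for this MB."""
    if motion_consistent(mvs, tolerance):
        return list(LARGE_MODES)
    return list(ALL_MODES)


print(candidate_modes([(1, 0), (1, 0), (1, 0), (1, 0)]))  # identical vectors: large partitions only
```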
SIGMAP 2007 - International Conference on Signal Processing and Multimedia Applications