2D video encoding algorithms remove only the spatial
and temporal redundancies) (Chen et al., 2016).
After compression, the video sequence is streamed
toward the receiver over one or more communication
channels. Since these channels may differ in
reliability, the video sequence may fail to be decoded
and displayed successfully. Therefore, a robust
streaming method is required to avoid such failures,
especially for communication channels with a large
noise variance. As described in (Kazemi et al., 2014;
Rahimi and Joslin, 2018), multiple description coding
(MDC) is a favourable method for streaming a video
sequence reliably when the noise variance is too large
for forward error correction (FEC) to correct the
failures. Several joint MDC and FEC algorithms exist,
but they are not our method of choice because of their
complexity and because we assume that the noise
variance is too large for FEC to be effective.
An MDC method partitions a video stream into
several separately decodable descriptions, which are
then streamed through the network. This has two
advantages. First, if there are not enough resources
(for example, bandwidth) to receive all descriptions
successfully, a subset of the descriptions can still be
received and decoded, so at least a lower-quality
version of the original video is available for display.
Second, errors that occur in one or more descriptions
can be concealed using the other, error-free
descriptions. These two advantages make MDC a
powerful strategy for avoiding packet failure in
multimedia communication over either wired or
wireless networks (Kazemi, 2012; Padmanabhan
et al., 2003).
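As a rough illustration of the partitioning idea above, the following minimal sketch splits one frame into four separately usable descriptions and rebuilds it from whatever subset arrives. The 2x2 polyphase layout, the averaging-based concealment, and the function names `split_polyphase` and `merge_polyphase` are assumptions made for this example only; they are not taken from the cited works or from the method proposed in this paper. Frames are assumed to be lists of rows with even height and width.

```python
def split_polyphase(frame):
    """Split a 2D frame (list of rows) into four 2x2 polyphase descriptions.

    Description (dy, dx) holds the pixels at rows dy, dy+2, ... and
    columns dx, dx+2, ... of the original frame.
    """
    return {
        (dy, dx): [row[dx::2] for row in frame[dy::2]]
        for dy in (0, 1)
        for dx in (0, 1)
    }


def merge_polyphase(received, height, width):
    """Rebuild a frame from any non-empty subset of the four descriptions.

    Pixels that belong to a lost description are replaced by the rounded
    mean of the received pixels in the same 2x2 block (a simple,
    assumed concealment rule).
    """
    if not received:
        raise ValueError("at least one description is required")
    frame = [[0] * width for _ in range(height)]
    for by in range(height // 2):
        for bx in range(width // 2):
            known = [sub[by][bx] for sub in received.values()]
            fill = round(sum(known) / len(known))
            for dy in (0, 1):
                for dx in (0, 1):
                    sub = received.get((dy, dx))
                    value = sub[by][bx] if sub is not None else fill
                    frame[2 * by + dy][2 * bx + dx] = value
    return frame
```

With all four descriptions the frame is recovered exactly; with fewer, the decoder still produces a full-size frame at lower quality, which mirrors the two advantages described above.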
In contrast to its error resiliency, the coding
efficiency of the MDC method is degraded, since each
description must include some extra information as a
header (Baccaglini et al., 2010). This loss in
compression ratio is unavoidable, as each description
needs to be separately decodable. However, that is not
the only cost: the compression ratio decreases further
because the data within each description are not as
correlated as they were in the original stream.
Therefore, the differential pulse code modulation
(DPCM) technique used by the encoder is not as
efficient as before. To increase the coding efficiency,
more correlated data must be assigned to the same
description; however, this weakens the ability to
estimate a lost description from the other available
descriptions. Therefore, there is a trade-off between
error resiliency and coding efficiency (Kazemi et al.,
2014; Rahimi and Joslin, 2018).
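The trade-off described above can be made concrete with a toy one-dimensional example. The sketch below is only an illustration under stated assumptions: the signal is a synthetic ramp, the DPCM predictor is the plain previous-sample difference, and the two partitioning rules ("interleaved" and "blocked") and all function names are hypothetical choices for this example rather than any cited algorithm.

```python
def mean_abs_dpcm_residual(samples):
    """Mean absolute previous-sample difference (a proxy for DPCM cost)."""
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    return sum(abs(d) for d in diffs) / len(diffs)


def conceal_interleaved(even):
    """Estimate the lost odd samples by averaging neighbouring even samples.

    The last odd sample has no right-hand neighbour, so the left one is held.
    """
    estimates = []
    for k, left in enumerate(even):
        right = even[k + 1] if k + 1 < len(even) else left
        estimates.append((left + right) / 2)
    return estimates


def conceal_blocked(first_block, count):
    """Estimate a lost second block by holding the last received sample."""
    return [first_block[-1]] * count


def mean_abs_error(estimates, truth):
    """Mean absolute difference between estimated and true samples."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


signal = [3 * n for n in range(16)]           # toy ramp, assumed input
interleaved = (signal[0::2], signal[1::2])    # decorrelated descriptions
blocked = (signal[:8], signal[8:])            # correlated descriptions
```

For this ramp, the undivided signal and each blocked description have a mean residual of 3.0, while each interleaved description has 6.0, so interleaving costs coding efficiency. Conversely, if the second description is lost, the interleaved split conceals it with a mean error of 0.375, whereas the blocked split gives 13.5.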
The domain chosen for partitioning the video data
determines the type of an MDC method, which can be
spatial, temporal, or frequency. Among the various
MDC types, temporal MDC is the most common since
it is very simple and provides better performance.
However, when the noise variance is very large and
more than two descriptions are required, its
performance degrades dramatically and other MDC
types become more favourable. In this paper, we
propose a hybrid MDC method that benefits from both
the spatial and temporal MDC types.
The remainder of this paper is organized as follows:
a brief literature review and our motivation are
presented in Section 2. The proposed method is then
introduced in Section 3, and test results are presented
and discussed in Section 4. Finally, we briefly review
our achievements in Section 5.
2 STATE OF THE ART
Generally, the temporal MDC type is more favourable
since it is very simple to implement and provides
better performance against network failure than the
spatial or frequency MDC methods; however, the
temporal MDC approach is more sensitive to the noise
variance, and its coding inefficiency is more severe
when the number of descriptions is more than two
(Kazemi et al., 2014). Also, to achieve the best
estimation power, a temporal MDC predicts a lost frame
bidirectionally, and therefore an MDC decoder needs to
receive the following frame before it can predict the
corrupted frame. This may make it unsuitable for
applications in which time is critical and which are
designed primarily to support live streaming (Rahimi
and Joslin, 2018).
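The latency issue noted above can be seen in a minimal sketch of two-description temporal MDC. The even/odd frame split, the plain averaging used for bidirectional concealment, and the names `split_temporal`, `conceal_frame`, and `decode_with_lost_odd` are assumptions for illustration only; practical decoders typically use motion-compensated interpolation instead. Frames are represented as flat lists of pixel values.

```python
def split_temporal(frames):
    """Split a frame sequence into even-indexed and odd-indexed descriptions."""
    return frames[0::2], frames[1::2]


def conceal_frame(prev_frame, next_frame):
    """Bidirectional estimate of a lost frame: per-pixel average of neighbours."""
    return [(p + n) / 2 for p, n in zip(prev_frame, next_frame)]


def decode_with_lost_odd(even, total):
    """Rebuild `total` frames when the whole odd description is lost.

    Each odd frame t is estimated from even frames t-1 and t+1, so the
    decoder cannot output frame t until frame t+1 has arrived. If no
    later frame exists, the previous frame is simply repeated.
    """
    decoded = []
    for t in range(total):
        if t % 2 == 0:
            decoded.append(list(even[t // 2]))
            continue
        prev_frame = even[t // 2]
        nxt = t // 2 + 1
        next_frame = even[nxt] if nxt < len(even) else prev_frame
        decoded.append(conceal_frame(prev_frame, next_frame))
    return decoded
```

The dependence on `even[nxt]` is the point of the example: concealing frame t requires buffering until frame t+1 is available, which adds at least one frame of delay in a live-streaming setting.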
The second option is to use the spatial MDC type, as
it is simple to implement and needs a decoder with
lower complexity than the frequency MDC type;
however, the reconstruction quality of a lost
description is lower than with the temporal MDC type.
To improve its performance, a nonidentical decimation
algorithm was proposed in (Rahimi and Joslin, 2018),
which was designed initially for 3D videos and does
not increase complexity significantly, as required for
a live streaming application. In this algorithm,
objects of interest in the scene are detected first and
are then assigned more bandwidth than other parts,
which are mainly the background. In addition to the
improvement measured by the objective assessment
presented in that work, it can also provide much better
performance in terms of subjective assessment, because
human eyes are more sensitive to objects than to
individual pixels and it is more im-
3D Video Spatiotemporal Multiple Description Coding Considering Region of Interest