coefficients are well approximated by the
Generalized Gaussian Function, while the low
frequency and DC coefficients are well approximated
by a mixture of several Generalized Gaussian
Functions. In Eude, Cherifi and Grisel (1994), the
DCT coefficients were approximated by a mixture of
Gaussian distributions, and a DCT-based compression
technique was developed on the basis of this model.
This compression algorithm employs a quantization
table that is a modification of the JPEG quantization
table according to their distribution model. The
reported results indicated superior visual quality in
comparison to that of JPEG.
In this paper, a progressive statistical and DCT
(SDCT) based image-coding scheme is presented.
The proposed coding scheme divides the input
image into a number of non-overlapping blocks and
applies a DCT to the pixels in each block. The
coefficients with the same frequency index in
different DCT blocks are grouped together to form
a number of matrices. The matrix containing the DC
coefficients is losslessly coded. The matrices
containing the AC (higher-frequency) coefficients are
coded using a novel statistical encoder, which is
developed in this paper. The proposed statistical
encoder applies a hierarchical estimation algorithm to
code the coefficients in each matrix. The hierarchical
estimation algorithm assumes that the distribution
of the coefficients in a matrix is Gaussian within
some regions. A threshold on the variance of the
coefficients is used to determine whether the
coefficients in the input matrix can be estimated by
the mean value of a single Gaussian distribution or
whether the matrix needs to be divided further into
four sub-blocks. This hierarchical algorithm is
repeated until the distribution of the coefficients in
every sub-block fulfils the above criterion. Finally,
the mean value of the Gaussian distribution of each
block is taken as the estimate for all coefficients in
that block. During the encoding process, a
quadtree-like binary map is generated to record the
hierarchical operation; this map is used in the
decoding process. The rest of the paper is organized
as follows: in Section 2 the proposed coding scheme is
discussed; Section 3 explains the decoder;
experimental results are presented in Section 4; and
finally Section 5 concludes the paper.
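The hierarchical estimation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `<=` comparison against the threshold, the 0/1 convention of the binary map, and the way odd-sized regions are halved are all assumptions made for illustration. The sketch tests only the variance threshold and does not fit a Gaussian model.

```python
from statistics import fmean, pvariance

def split(top, left, height, width):
    """Return up to four non-empty sub-regions (assumed halving rule)."""
    h2, w2 = (height + 1) // 2, (width + 1) // 2
    return [(r0, c0, rh, cw)
            for r0, rh in ((top, h2), (top + h2, height - h2))
            for c0, cw in ((left, w2), (left + w2, width - w2))
            if rh > 0 and cw > 0]

def _encode(m, top, left, height, width, threshold, means, bits):
    values = [m[r][c] for r in range(top, top + height)
                      for c in range(left, left + width)]
    # Leaf: variance is small enough, or the region is a single coefficient.
    if pvariance(values) <= threshold or len(values) == 1:
        bits.append(0)                 # assumed convention: 0 = leaf
        means.append(fmean(values))    # one mean represents the region
        return
    bits.append(1)                     # assumed convention: 1 = split
    for region in split(top, left, height, width):
        _encode(m, *region, threshold, means, bits)

def encode_matrix(m, threshold):
    """Return (mean vector, quadtree-like binary map) for matrix m."""
    means, bits = [], []
    _encode(m, 0, 0, len(m), len(m[0]), threshold, means, bits)
    return means, bits

def _decode(out, top, left, height, width, means, bits):
    if next(bits) == 0:
        mean = next(means)
        for r in range(top, top + height):
            for c in range(left, left + width):
                out[r][c] = mean
        return
    for region in split(top, left, height, width):
        _decode(out, *region, means, bits)

def decode_matrix(means, bits, height, width):
    """Rebuild the estimated matrix from the mean vector and binary map."""
    out = [[0.0] * width for _ in range(height)]
    _decode(out, 0, 0, height, width, iter(means), iter(bits))
    return out
```

For a 4×4 matrix whose top-left quadrant differs from the rest, a small threshold forces one split, after which each quadrant is represented by a single mean; a larger threshold trades accuracy for fewer means and a shorter map.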
2 PROGRESSIVE STATISTICAL AND DCT BASED IMAGE ENCODER
A block diagram of the progressive Statistical
Discrete Cosine Transform (SDCT) based image
encoder is illustrated in Figure 1. A gray scale image
is input to the encoder. The encoder divides the
input image into a number of non-overlapping 8×8
pixel blocks called B_{11} to B_{nn}, as shown in
Figure 1(a). The pixels in each block are then
transformed into the frequency domain using a DCT,
as shown in Figure 1(b), where A_{0-ij} to A_{63-ij}
are the DCT coefficients of block B_{ij}. The
coefficients with the same frequency index in
different blocks are then grouped together to form
64 matrices called M_0 to M_63, where M_0 contains
the DC coefficients and M_1 to M_63 contain the AC
coefficients from the lowest to the highest frequency,
respectively. Figure 1(c) shows one of these matrices
(M_k), where A_{k-11} to A_{k-nn} represent the
coefficients with the same frequency index k, which
can take a value between 1 and 63, in the different
transformed blocks. In this figure, the indices 11 to
nn give the position of the block to which each
coefficient belongs.
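The block transform and the grouping of coefficients by frequency index can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the orthonormal DCT-II scaling, the JPEG-style zig-zag ordering of the 64 frequency indices, and the names `dct_2d`, `ZIGZAG` and `group_by_frequency` are assumptions, and the image dimensions are assumed to be multiples of 8.

```python
import math

N = 8  # block size stated in the text

def dct_1d(v):
    """Orthonormal DCT-II of a sequence (assumed scaling)."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to [u][v]

# Assumed JPEG-style zig-zag order, so that index 0 is the DC term and
# indices 1..63 run from low to high frequency.
ZIGZAG = sorted(((u, v) for u in range(N) for v in range(N)),
                key=lambda p: (p[0] + p[1],
                               p[0] if (p[0] + p[1]) % 2 else p[1]))

def group_by_frequency(image):
    """Return 64 matrices; matrices[k][i][j] is coefficient k of block (i, j)."""
    rows, cols = len(image) // N, len(image[0]) // N
    matrices = [[[0.0] * cols for _ in range(rows)] for _ in range(N * N)]
    for i in range(rows):
        for j in range(cols):
            block = [image[N * i + r][N * j:N * j + N] for r in range(N)]
            coeffs = dct_2d(block)
            for k, (u, v) in enumerate(ZIGZAG):
                matrices[k][i][j] = coeffs[u][v]
    return matrices
```

With this layout, each of the 64 matrices has one entry per image block, so a 16×16 image yields 64 matrices of size 2×2.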
Figure 1(d) illustrates the encoding stage of the
64 matrices. M_0, which contains most of the
image energy, is losslessly coded using a lossless
DPCM method. The matrices M_1 to M_63 are coded
individually using the following operations: (i) the
coefficients in each matrix are first level shifted to
have a minimum value (Min) of zero; (ii) the
resulting coefficients are then coded using a novel
statistical encoding algorithm, which is presented in
Sub-section 2.2. The statistical encoder takes the
coefficients in each matrix and a threshold value
generated specifically for that matrix (detailed in
Sub-section 2.1) and performs the encoding
process. The output of this encoder is a mean vector
(mv), which carries the mean values, and a binary
vector (q), which carries the quadtree-like data; (iii)
finally, a multiplexer puts the encoded information
together and generates a bitstream called BS_L, where
L specifies the corresponding matrix, as shown in
Figure 1(d). The resulting bitstreams are transmitted
sequentially from BS_0 to BS_63 to perform
progressive image transmission.
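The two simplest stages of Figure 1(d), the level shift applied to the AC matrices and the lossless DPCM applied to the DC matrix, can be sketched as below. The text does not specify the DPCM predictor or the scan order, so the previous-sample predictor, the raster scan, the integer-valued input and all function names here are assumptions for illustration only.

```python
def level_shift(matrix):
    """Subtract the minimum so the smallest coefficient becomes zero.

    Returns (Min, shifted matrix); Min must be kept to undo the shift.
    """
    minimum = min(min(row) for row in matrix)
    return minimum, [[x - minimum for x in row] for row in matrix]

def dpcm_encode(matrix):
    """Lossless DPCM with an assumed previous-sample predictor (raster scan)."""
    residuals, prev = [], 0
    for row in matrix:
        for x in row:
            residuals.append(x - prev)  # transmit only the prediction error
            prev = x
    return residuals

def dpcm_decode(residuals, rows, cols):
    """Invert dpcm_encode by accumulating the prediction errors."""
    flat, prev = [], 0
    for d in residuals:
        prev += d
        flat.append(prev)
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]
```

Because neighbouring DC coefficients tend to be similar, the residuals are typically small in magnitude, which is what makes them cheaper to entropy code than the raw values; the entropy coder itself is outside the scope of this sketch.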
VISAPP 2008 - International Conference on Computer Vision Theory and Applications