3 COMPLEXITY VERIFICATION
In this section, the evaluated transform coding models are
examined to verify whether real-time processing is feasible.
Following the simplification applied in Section 2, the only
criterion for real-time processing is the computation of the
3-D DCT.
The algorithms described in Subsections 2.1 and 2.2 were
programmed for the TMS320C6711 digital signal processor from
Texas Instruments, both in the C language and in so-called
linear assembly. Linear assembly is an intermediate stage
between the high-level C language and low-level assembly code.
The floating-point TMS320C6711 contains eight functional units,
such as a hardware multiplier unit or a unit for memory access,
and 32 32-bit registers, and it is clocked at the relatively low
frequency of 150 MHz. The basic measure of algorithm speed is
the total number of CPU cycles. Every DSP instruction needs a
specific number of cycles, which depends on the type of
instruction. In general, the most expensive instructions are
memory accesses and multiplications in double or even extended
floating-point precision.
The total numbers of CPU cycles needed for the 8-point and the
4-point 1-D DCT are shown in Table 1 and Table 2, respectively.
The estimated time for encoding a grayscale video sequence with
dimensions of 720 × 576 × 24 is shown as well. The parameters
-o0, -o1, -o2 and -o3 correspond to the level of source code
optimization. Parameter -o0 enables register-level optimization,
-o1 and -o2 enable function-level optimization, and -o3
corresponds to file-level optimization. The optimization can be
performed by the Code Composer Studio development software from
Texas Instruments.
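The relation between the cycle counts and the estimated times is
not stated explicitly. The sketch below assumes that a tabulated
cycle count covers one 1-D pass over a single n × n × n cube and
that three such passes are needed per cube. This assumption is
inferred from the values in Tables 1 and 2, which it reproduces;
the function name estimate_time_s and its parameters are
hypothetical.

```c
/* Estimated time in seconds for transforming a
   width x height x frames sequence split into n x n x n cubes.
   Assumption: cycles_per_pass is the cost of one 1-D pass over
   one cube, and each cube needs three passes (one per axis). */
double estimate_time_s(long width, long height, long frames,
                       int n, long cycles_per_pass, double f_cpu_hz)
{
    double samples = (double)width * (double)height * (double)frames;
    double cubes = samples / ((double)n * n * n);
    double total_cycles = 3.0 * cubes * (double)cycles_per_pass;
    return total_cycles / f_cpu_hz;
}
```

For the unoptimized C code, estimate_time_s(720, 576, 24, 8,
57655, 150e6) gives about 22.42 s and estimate_time_s(720, 576,
24, 4, 5470, 150e6) gives about 17.01 s, in agreement with the
tabulated values.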
An example of using the 4-point version of the DCT encoder to
process a real video sequence is shown in Fig. 1, which presents
one frame from the original sequence and three details at
different quality levels.
Table 1: CPU cycles for the 8-point 1-D DCT and duration of
transforming a grayscale video sequence (720 × 576 × 24,
f_CPU = 150 MHz).
            C code                ASM code
Param.      Cycles    Time [s]    Cycles    Time [s]
no opt.     57,655    22.42       10,284    4.00
-o0         52,759    20.51       10,284    4.00
-o1         26,226    10.20        5,206    2.02
-o2         15,527     6.04        2,144    0.83
-o3         15,527     6.04        2,144    0.83
Table 2: CPU cycles for the 4-point 1-D DCT and duration of
transforming a grayscale video sequence (720 × 576 × 24,
f_CPU = 150 MHz).
            C code                ASM code
Param.      Cycles    Time [s]    Cycles    Time [s]
no opt.      5,470    17.01        1,054    3.28
-o0          4,615    14.35        1,054    3.28
-o1          2,975     9.25          702    2.18
-o2          1,583     4.92          417    1.30
-o3          1,583     4.92          417    1.30
Figure 1: A frame of the tested sequence "high jump" encoded by
the 4-point encoder version at different quality levels.
It can be seen that the combination of lower-level programming
languages and optimization tools is unavoidable for achieving
efficient code. It can also be seen that the only possibility
for encoding a grayscale sequence with PAL resolution in real
time on the TMS320C6711 DSP (f_CPU = 150 MHz) is to use the
8-point fast algorithm version with the maximal level of
optimization.
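The real-time criterion behind this observation can be sketched
as follows. The sketch assumes the standard PAL frame rate of
25 frames per second, which is not stated in the text; under
that assumption the 24 frames of the test sequence must be
transformed within 24 / 25 = 0.96 s. The function name
fits_real_time is hypothetical.

```c
/* Returns 1 if a sequence of `frames` frames is transformed no
   slower than it is played back at `fps` frames per second.
   Assumption: fps = 25.0 for PAL (not given in the text). */
int fits_real_time(double encode_time_s, int frames, double fps)
{
    double playback_time_s = (double)frames / fps;
    return encode_time_s <= playback_time_s;
}
```

With the tabulated times, only the 8-point linear assembly
version at -o2 or -o3 (0.83 s) stays within the 0.96 s budget;
the fastest 4-point version (1.30 s) does not.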
4 CONCLUSIONS
The contribution focused on the video compression domain, and
mainly on the modeling of a real-time 3-D DCT encoding system.
This 3-D system is used to replace two ways of video
compression, i.e. intraframe and interframe coding. In the
paper, two versions of fast 1-D algorithms were outlined, for
the 8-point and the 4-point DCT system. It was proved that the total
SIGMAP 2008 - International Conference on Signal Processing and Multimedia Applications