Parallel Version n-Dimensional Fast Fourier Transform Algorithm

Analog of the Cooley-Tukey Algorithm

M. V. Noskov and V. S. Tutatchikov

Institute of Space and Information Technology, Siberian Federal University, Kirenskogo Street 26, Krasnoyarsk, Russia

Keywords: Multi-dimensional Discrete Fourier Transform, Cooley-Tukey FFT, Parallel Algorithm.

Abstract: One-, two- and three-dimensional fast Fourier transform (FFT) algorithms have been widely used in digital signal processing. Because the complexity and the amount of computation grow with the dimension of the signal, the multi-dimensional discrete Fourier transform is usually reduced to a combination of one-dimensional FFTs over all coordinates. This article presents a general n-dimensional analog of the Cooley-Tukey algorithm, which requires fewer complex additions and multiplications than the standard method and runs about 1.5 times faster than the analogous routine in Matlab.

1 INTRODUCTION

One-, two- and three-dimensional fast Fourier transform (FFT) algorithms have been widely used in digital signal processing (Dudgeon, 1983; Blahut, 1985). Because the complexity and the amount of computation grow with the dimension of the signal, the multi-dimensional discrete Fourier transform is usually reduced to a combination of one-dimensional FFTs over all coordinates. This article presents a general n-dimensional analog of the Cooley-Tukey algorithm, which requires fewer complex additions and multiplications than the standard method (Tutatchikov, 2013). The resulting algorithm is tested in two and three dimensions against the standard algorithm in Matlab (Gonzalez, 2009).

2 THE ALGORITHM DESCRIPTION

Consider a signal $f$, an n-dimensional periodic signal with period $2^s$ in each of the n coordinates and with complex values. The samples are given as $f = f(x_1, \ldots, x_n)$, where $x_i$, $i = 1, \ldots, n$, take the values $0, 1, \ldots, 2^s - 1$.

The discrete Fourier transform (DFT) $F = F(y_1, \ldots, y_n)$ of the signal $f(x_1, \ldots, x_n)$ is given by the formula:

 



\[
F(y_1, \ldots, y_n) = \sum_{x_1=0}^{2^s-1} \cdots \sum_{x_n=0}^{2^s-1} f(x_1, \ldots, x_n)\, e^{-\frac{2\pi i}{2^s}(x_1 y_1 + \ldots + x_n y_n)} \tag{1}
\]

where $y_i$, $i = 1, \ldots, n$, take the values $0, \ldots, 2^s - 1$.
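The definition of the DFT above can be checked on small signals by evaluating the sum directly. The following is a minimal sketch, not the authors' implementation: the function name `dft_nd`, the dictionary-based storage, and the negative sign in the exponent (the convention also used by Matlab's FFT routines) are assumptions made for illustration.

```python
import cmath
from itertools import product

def dft_nd(f, n, s):
    """Direct n-dimensional DFT of a signal with 2**s samples per coordinate.

    f maps an index tuple (x_1, ..., x_n) to a complex sample.
    Returns a dict with the same keys holding F(y_1, ..., y_n).
    """
    N = 2 ** s
    grid = list(product(range(N), repeat=n))
    F = {}
    for y in grid:
        total = 0j
        for x in grid:
            # Phase x_1*y_1 + ... + x_n*y_n of the complex exponential.
            phase = sum(xi * yi for xi, yi in zip(x, y))
            total += f[x] * cmath.exp(-2j * cmath.pi * phase / N)
        F[y] = total
    return F

# Usage: a 2-D unit impulse at the origin has a flat spectrum.
s, n = 2, 2
f = {x: 0j for x in product(range(2 ** s), repeat=n)}
f[(0, 0)] = 1 + 0j
F = dft_nd(f, n, s)
```

The direct sum costs on the order of $N^{2n}$ operations for $N = 2^s$, which is why it is useful only as a reference for testing faster algorithms.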

2.1 n-Dimensional FFT

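Transform formula (1) as follows. Substitute $x_i = 2 x_i^1 + a_i$ and $y_i = y_i^1 + 2^{s-1} b_i$ with $a_i, b_i \in \{0, 1\}$, and note that $e^{-2\pi i x_i^1 b_i} = 1$ and $e^{-\pi i a_i b_i} = (-1)^{a_i b_i}$: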



\[
F_1\bigl(y_1^1 + 2^{s-1} b_1, \ldots, y_n^1 + 2^{s-1} b_n\bigr) = \sum_{a_1=0}^{1} \cdots \sum_{a_n=0}^{1} (-1)^{a_1 b_1 + \ldots + a_n b_n}\, e^{-\frac{2\pi i}{2^s}(a_1 y_1^1 + \ldots + a_n y_n^1)}\, g^1_{a_1, \ldots, a_n}\bigl(y_1^1, \ldots, y_n^1\bigr), \qquad b_i = 0, 1 \tag{2}
\]

where the coordinates $y_i^1$ of the samples of the sub-signals $g^1_{a_1, \ldots, a_n}$ run over $2^{s-1}$ values, $x_i^1 = 0 : 2^{s-1} - 1$,


Noskov M. and Tutatchikov V.
Parallel Version n-Dimensional Fast Fourier Transform Algorithm - Analog of the Cooley-Tukey Algorithm.
DOI: 10.5220/0005461401140117
In Proceedings of the 5th International Workshop on Image Mining. Theory and Applications (IMTA-5-2015), pages 114-117
ISBN: 978-989-758-094-9
Copyright © 2015 SCITEPRESS (Science and Technology Publications, Lda.)

$i = 1 : n$, and $F_1$ is the FFT of the source signal $f$. For convenience, denote $F_0 = f$:



   



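\[
F_1\bigl(y_1^1 + 2^{s-1} b_1, \ldots, y_n^1 + 2^{s-1} b_n\bigr) = \sum_{a_1=0}^{1} \cdots \sum_{a_n=0}^{1} (-1)^{a_1 b_1 + \ldots + a_n b_n}\, e^{-\frac{2\pi i}{2^s}(a_1 y_1^1 + \ldots + a_n y_n^1)} \sum_{x_1^1=0}^{2^{s-1}-1} \cdots \sum_{x_n^1=0}^{2^{s-1}-1} F_0\bigl(2 x_1^1 + a_1, \ldots, 2 x_n^1 + a_n\bigr)\, e^{-\frac{2\pi i}{2^{s-1}}(x_1^1 y_1^1 + \ldots + x_n^1 y_n^1)} \tag{3}
\]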
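A single partition step can be verified numerically in the two-dimensional case. The sketch below is illustrative only and assumes the negative-exponent DFT convention: it builds the four half-size transforms of the sub-signals $f(2x_1 + a_1, 2x_2 + a_2)$, combines them with the sign $(-1)^{a_1 b_1 + a_2 b_2}$ and the twiddle factor $e^{-2\pi i (a_1 y_1 + a_2 y_2)/2^s}$, and compares the result with a direct DFT. The names `dft2` and `one_step` are hypothetical.

```python
import cmath
from itertools import product

def dft2(f, N):
    """Direct 2-D DFT of an N x N list of lists (used as the reference)."""
    def w(k):
        return cmath.exp(-2j * cmath.pi * k / N)
    return [[sum(f[x1][x2] * w(x1 * y1 + x2 * y2)
                 for x1 in range(N) for x2 in range(N))
             for y2 in range(N)] for y1 in range(N)]

def one_step(f, s):
    """One partition step: assemble F from the four half-size DFTs g[a]."""
    N, H = 2 ** s, 2 ** (s - 1)
    # g[(a1, a2)] is the DFT of the sub-signal f(2*x1 + a1, 2*x2 + a2).
    g = {(a1, a2): dft2([[f[2 * x1 + a1][2 * x2 + a2] for x2 in range(H)]
                         for x1 in range(H)], H)
         for a1, a2 in product((0, 1), repeat=2)}
    F = [[0j] * N for _ in range(N)]
    for y1, y2 in product(range(H), repeat=2):
        for b1, b2 in product((0, 1), repeat=2):
            acc = 0j
            for (a1, a2), ga in g.items():
                sign = (-1) ** (a1 * b1 + a2 * b2)
                twiddle = cmath.exp(-2j * cmath.pi * (a1 * y1 + a2 * y2) / N)
                acc += sign * twiddle * ga[y1][y2]
            F[y1 + H * b1][y2 + H * b2] = acc
    return F

# Usage: compare one partition step with the direct DFT on a 4 x 4 signal.
s = 2
N = 2 ** s
f = [[complex(x1 + 2 * x2, x1 * x2) for x2 in range(N)] for x1 in range(N)]
err = max(abs(u - v) for ru, rv in zip(one_step(f, s), dft2(f, N))
          for u, v in zip(ru, rv))
```

For each pair $(y_1, y_2)$ the four twiddled values are shared by the four outputs with different $(b_1, b_2)$, which is where the saving in multiplications comes from.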

Continue the same procedure for each $g^1_{a_1, \ldots, a_n}$, that is, represent the signal $g^1_{a_1, \ldots, a_n}$ as a sum of sub-signals:









\[
g^1_{a_1, \ldots, a_n} = \sum g^2_{a_1, \ldots, a_n} \tag{4}
\]

where the coordinates of the samples of the sub-signals $g^2_{a_1, \ldots, a_n}$ run over $2^{s-2}$ values.

Continuing this process, we can represent $F(y_1, \ldots, y_n)$ as a sum of DFTs of signals in which each of the n coordinates runs over only two values. We obtain the following formula for calculating $F_v(y_1, \ldots, y_n)$:



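\[
F_v\bigl(y_1^v + 2^{s-v} b_1 + 2^{s-v+1} c_1, \ldots, y_n^v + 2^{s-v} b_n + 2^{s-v+1} c_n\bigr) = \sum_{a_1=0}^{1} \cdots \sum_{a_n=0}^{1} (-1)^{a_1 b_1 + \ldots + a_n b_n}\, e^{-\frac{2\pi i}{2^{s-v+1}} a_1 y_1^v} \cdots e^{-\frac{2\pi i}{2^{s-v+1}} a_n y_n^v}\, g^v_{a_1, \ldots, a_n}\bigl(y_1^v + 2^{s-v+1} c_1, \ldots, y_n^v + 2^{s-v+1} c_n\bigr), \qquad b_i = 0, 1, \quad c_i = 0 : 2^{v-1} - 1 \tag{5}
\]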

where $v = 1 : s$ is the number of the step of the partition of the signal $F_0(x_1, \ldots, x_n)$ into sub-signals.

Consider formula (5) in more detail:



 



 



\[
F_v\bigl(y_1^v + 2^{s-v} b_1 + 2^{s-v+1} c_1, \ldots, y_n^v + 2^{s-v} b_n + 2^{s-v+1} c_n\bigr) = \sum_{a_1=0}^{1} \cdots \sum_{a_n=0}^{1} (-1)^{a_1 b_1 + \ldots + a_n b_n}\, e^{-\frac{2\pi i}{2^{s-v+1}} a_1 y_1^v} \cdots e^{-\frac{2\pi i}{2^{s-v+1}} a_n y_n^v} \sum_{x_1^v=0}^{2^{s-v}-1} \cdots \sum_{x_n^v=0}^{2^{s-v}-1} F_0\bigl(2^v x_1^v + 2^{v-1} a_1 + c_1, \ldots, 2^v x_n^v + 2^{v-1} a_n + c_n\bigr)\, e^{-\frac{2\pi i}{2^{s-v}}(x_1^v y_1^v + \ldots + x_n^v y_n^v)} \tag{6}
\]

where $x_i^v, y_i^v = 0 : 2^{s-v} - 1$, $i = 1 : n$; for $v = 1$, $F_1(y_1, \ldots, y_n) = F(y_1, \ldots, y_n)$ is the discrete Fourier transform of $f$.
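Applying the partition step recursively down to single samples gives a complete n-dimensional radix-2 FFT. The following sketch is a plain recursive illustration under the negative-exponent convention, not the authors' C++ program; `fft_nd`, `dft_nd` and the dictionary storage are hypothetical choices made to keep the example short.

```python
import cmath
from itertools import product

def dft_nd(f, n, s):
    """Direct evaluation of the DFT sum, used only as a reference."""
    N = 2 ** s
    grid = list(product(range(N), repeat=n))
    return {y: sum(f[x] * cmath.exp(-2j * cmath.pi
                                    * sum(p * q for p, q in zip(x, y)) / N)
                   for x in grid)
            for y in grid}

def fft_nd(f, n, s):
    """Recursive n-D radix-2 FFT; f is a dict keyed by index tuples."""
    if s == 0:
        return {(0,) * n: f[(0,) * n]}
    N, H = 2 ** s, 2 ** (s - 1)
    corners = list(product((0, 1), repeat=n))
    # Transform each of the 2**n decimated sub-signals f(2x + a) recursively.
    g = {}
    for a in corners:
        sub = {x: f[tuple(2 * xi + ai for xi, ai in zip(x, a))]
               for x in product(range(H), repeat=n)}
        g[a] = fft_nd(sub, n, s - 1)
    F = {}
    for y in product(range(H), repeat=n):
        # Twiddled sub-transform values, shared by all 2**n outputs below.
        tw = {a: cmath.exp(-2j * cmath.pi
                           * sum(ai * yi for ai, yi in zip(a, y)) / N) * g[a][y]
              for a in corners}
        for b in corners:
            out = tuple(yi + H * bi for yi, bi in zip(y, b))
            F[out] = sum((-1) ** sum(ai * bi for ai, bi in zip(a, b)) * tw[a]
                         for a in corners)
    return F

# Usage: compare with the direct DFT on an 8 x 8 signal.
n, s = 2, 3
f = {x: complex(sum(x), x[0] - x[1]) for x in product(range(2 ** s), repeat=n)}
fast, slow = fft_nd(f, n, s), dft_nd(f, n, s)
max_err = max(abs(fast[y] - slow[y]) for y in f)
```

A production implementation would work in place on a contiguous array rather than on dictionaries; the recursion here only mirrors the structure of the partition.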

2.2 Parallel Algorithm FFT

The calculation of $F(y_1, \ldots, y_n)$ can be parallelized over independent computation flows. In the presence of $2^q$, $0 \le q \le s$, flows, formula (6) takes the form:



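\[
F_v(y_1, \ldots, y_n) = F_v\bigl(y_1^v + 2^{s-v-q} t + 2^{s-v} b_1 + 2^{s-v+1} c_1,\; y_2^v + 2^{s-v} b_2 + 2^{s-v+1} c_2,\; \ldots,\; y_n^v + 2^{s-v} b_n + 2^{s-v+1} c_n\bigr) \tag{7}
\]

where $t = 0 : 2^q - 1$ is the number of the flow, $v \le s - q$, $y_1^v = 0 : 2^{s-v-q} - 1$ in the first coordinate, $y_i^v = 0 : 2^{s-v} - 1$ for $i = 2 : n$, $b_i = 0, 1$, and $c_i = 0 : 2^{v-1} - 1$.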

Consider formula (7) in more detail:

ParallelVersionn-DimensionalFastFourierTransformAlgorithm-AnalogoftheCooley-TukeyAlgorithm

115



\[
F_v\bigl(y_1^v + 2^{s-v-q} t + 2^{s-v} b_1 + 2^{s-v+1} c_1,\; y_2^v + 2^{s-v} b_2 + 2^{s-v+1} c_2,\; \ldots,\; y_n^v + 2^{s-v} b_n + 2^{s-v+1} c_n\bigr) = \sum_{a_1=0}^{1} \cdots \sum_{a_n=0}^{1} (-1)^{a_1 b_1 + \ldots + a_n b_n}\, e^{-\frac{2\pi i}{2^{s-v+1}} a_1 (y_1^v + 2^{s-v-q} t)}\, e^{-\frac{2\pi i}{2^{s-v+1}} a_2 y_2^v} \cdots e^{-\frac{2\pi i}{2^{s-v+1}} a_n y_n^v}\, g^v_{a_1, \ldots, a_n}\bigl(y_1^v + 2^{s-v-q} t + 2^{s-v+1} c_1,\; y_2^v + 2^{s-v+1} c_2,\; \ldots,\; y_n^v + 2^{s-v+1} c_n\bigr) \tag{8}
\]

The sub-signals $g^v_{a_1, \ldots, a_n}$ may be described as follows:









 



 



\[
g^v_{a_1, \ldots, a_n}\bigl(y_1^v + 2^{s-v-q} t + 2^{s-v+1} c_1,\; y_2^v + 2^{s-v+1} c_2,\; \ldots,\; y_n^v + 2^{s-v+1} c_n\bigr) = \sum_{x_1^v=0}^{2^{s-v}-1} \cdots \sum_{x_n^v=0}^{2^{s-v}-1} F_0\bigl(2^v x_1^v + 2^{v-1} a_1 + c_1, \ldots, 2^v x_n^v + 2^{v-1} a_n + c_n\bigr)\, e^{-\frac{2\pi i}{2^{s-v}} x_1^v (y_1^v + 2^{s-v-q} t)}\, e^{-\frac{2\pi i}{2^{s-v}}(x_2^v y_2^v + \ldots + x_n^v y_n^v)} \tag{9}
\]
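The independence of the flows can be illustrated for one partition step of a two-dimensional signal. The sketch below makes an explicit assumption about the work split: the range of the first frequency coordinate is cut into $2^q$ equal slabs and flow $t$ combines only the outputs of its own slab. The names `combine_slab` and `parallel_step` are hypothetical, and Python threads are used only to show that the slabs do not interact; because of the interpreter lock they are not expected to give a speedup.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def dft2(f, N):
    """Direct 2-D DFT of an N x N list of lists (reference and leaf transform)."""
    def w(k):
        return cmath.exp(-2j * cmath.pi * k / N)
    return [[sum(f[x1][x2] * w(x1 * y1 + x2 * y2)
                 for x1 in range(N) for x2 in range(N))
             for y2 in range(N)] for y1 in range(N)]

def combine_slab(g, s, q, t):
    """Work of flow t: butterflies whose first coordinate y1 lies in slab t."""
    N, H = 2 ** s, 2 ** (s - 1)
    width = H // 2 ** q
    out = {}
    for y1 in range(t * width, (t + 1) * width):
        for y2 in range(H):
            tw = {a: cmath.exp(-2j * cmath.pi * (a[0] * y1 + a[1] * y2) / N)
                     * g[a][y1][y2] for a in g}
            for b1, b2 in product((0, 1), repeat=2):
                out[(y1 + H * b1, y2 + H * b2)] = sum(
                    (-1) ** (a[0] * b1 + a[1] * b2) * v for a, v in tw.items())
    return out

def parallel_step(f, s, q):
    """One partition step of the 2-D FFT, combined by 2**q flows."""
    H = 2 ** (s - 1)
    if 2 ** q > H:
        raise ValueError("need 2**q <= 2**(s-1) so every flow gets a slab")
    g = {a: dft2([[f[2 * x1 + a[0]][2 * x2 + a[1]] for x2 in range(H)]
                  for x1 in range(H)], H)
         for a in product((0, 1), repeat=2)}
    F = {}
    with ThreadPoolExecutor(max_workers=2 ** q) as pool:
        for part in pool.map(lambda t: combine_slab(g, s, q, t), range(2 ** q)):
            F.update(part)  # slabs are disjoint, so no flow overwrites another
    return F

# Usage: two flows on an 8 x 8 signal, checked against the direct DFT.
s, q = 3, 1
N = 2 ** s
f = [[complex(x1 - x2, x1 * x2) for x2 in range(N)] for x1 in range(N)]
F, ref = parallel_step(f, s, q), dft2(f, N)
max_err = max(abs(F[(y1, y2)] - ref[y1][y2])
              for y1 in range(N) for y2 in range(N))
```

Each flow reads the shared sub-transforms and writes a disjoint set of outputs, so no synchronization is needed until the results are merged.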

3 THE OBTAINED RESULTS

To test the algorithm, a program was written in C++ for two- and three-dimensional signals. The testing was conducted on a PC with the following characteristics:

Processor: AMD FX-4170 4.2 GHz;

RAM: 8 GB;

Operating system: Windows 7.

The program was compared with the standard algorithm for the discrete Fourier transform in Matlab 7.5.0 (R2007b) in the two- and three-dimensional cases. The test results, in seconds, are shown in the tables.

Table 1 compares the runtime, in seconds, of the two-dimensional FFT computed by the Cooley-Tukey algorithm analog with that of the standard two-dimensional FFT in Matlab.

Table 2 compares the runtime, in seconds, of the three-dimensional FFT computed by the Cooley-Tukey algorithm analog with that of the standard three-dimensional FFT in Matlab.

Table 3 compares the runtime, in seconds, of the parallel version of the two-dimensional FFT by the Cooley-Tukey algorithm analog with that of the parallel standard algorithm, which computes the two-dimensional FFT as a combination of one-dimensional FFTs.

Table 1: Calculating 2D FFT.

Signal size    2D FFT, Matlab (s)    2D FFT, Cooley-Tukey algorithm analog, C++ (s)    Speedup
128*128        0.001                 0.001                                             ~1
256*256        0.005                 0.004                                             ~1
512*512        0.027                 0.017                                             ~1.6
1024*1024      0.125                 0.087                                             ~1.4
2048*2048      0.620                 0.389                                             ~1.6
4096*4096      2.634                 1.637                                             ~1.6
8192*8192      13.609                6.904                                             ~2
16384*16384    -                     20.383

Table 2: Calculating 3D FFT.

Signal size    3D FFT, Matlab (s)    3D FFT, Cooley-Tukey algorithm analog, C++ (s)    Speedup
32*32*32       0.002                 0.002                                             ~1.0
64*64*64       0.028                 0.020                                             ~1.4
128*128*128    0.282                 0.188                                             ~1.5
256*256*256    2.546                 1.660                                             ~1.5
512*512*512    -                     14.736

IMTA-52015-5thInternationalWorkshoponImageMining.TheoryandApplications

116

Table 3: Parallel calculating 2D FFT.

Signal size    Number of processes    Combination of 1D FFTs (s)    2D FFT, Cooley-Tukey algorithm analog (s)    Speedup, Cooley-Tukey
1024*1024      1                      0.112                         0.057                                        ~1.6
               2                      0.142                         0.070                                        ~1.0
               4                      0.154                         0.099                                        ~0.8
               8                      0.257                         0.092                                        ~0.7
               16                     0.330                         0.088                                        ~0.5
2048*2048      1                      0.516                         0.275                                        ~1.7
               2                      0.512                         0.396                                        ~1.2
               4                      0.596                         0.407                                        ~1.1
               8                      1.045                         0.345                                        ~0.9
               16                     1.195                         0.453                                        ~0.8
4096*4096      1                      2.193                         1.355                                        ~1.7
               2                      2.399                         1.194                                        ~1.4
               4                      2.393                         2.098                                        ~1.2
               8                      4.412                         1.946                                        ~1.1
               16                     3.946                         1.912                                        ~1.1
8192*8192      1                      12.538                        4.957                                        ~1.7
               2                      10.509                        5.245                                        ~1.4
               4                      11.753                        7.848                                        ~1.2
               8                      18.551                        8.162                                        ~1.1
               16                     18.196                        8.907                                        ~1.2

Figure 1: Example of two-dimensional signal.

4 CONCLUSIONS

The modified algorithm of the n-dimensional fast Fourier transform by analogy with the Cooley-Tukey algorithm requires $\frac{2^n - 1}{2^n} N^n \log_2 N$ complex multiplications and $n N^n \log_2 N$ complex additions, where $N = 2^s$ is the number of samples along each coordinate (Starovoitov, 2010). The standard algorithm requires $n N^n \log_2 N$ complex multiplications and $n N^n \log_2 N$ complex additions. The modified algorithm requires fewer complex operations than the standard method and runs about 1.5 times faster than the analogous routine in Matlab.
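The difference in multiplication counts can be made concrete with a short calculation. The sketch below assumes the counts read as $\frac{2^n - 1}{2^n} N^n \log_2 N$ for the modified algorithm and $n N^n \log_2 N$ for the standard one, with $N = 2^s$; the function name `complex_multiplications` is hypothetical.

```python
def complex_multiplications(s, n):
    """Return (modified, standard) multiplication counts for N = 2**s samples
    per coordinate of an n-dimensional signal (requires s >= 1)."""
    N = 2 ** s
    # (2**n - 1) / 2**n * N**n * log2(N); exact because 2**n divides N**n.
    modified = (2 ** n - 1) * N ** n * s // 2 ** n
    # n * N**n * log2(N) for the standard coordinate-by-coordinate method.
    standard = n * N ** n * s
    return modified, standard

# Usage: a 1024 x 1024 signal (s = 10, n = 2).
modified, standard = complex_multiplications(10, 2)
ratio = modified / standard
```

Under these assumed formulas the ratio is $(2^n - 1)/(n\, 2^n)$ regardless of $N$, that is 3/8 in two dimensions and 7/24 in three.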

ACKNOWLEDGEMENTS

This work was performed under the state order of the Ministry of Education and Science of the Russian Federation at the Siberian Federal University for R&D in 2014 (Task No 1.1462.2014/K). Project title: “Algebraic and analytic methods for creating algorithms for solving differential and polynomial systems: factorization, resolution of singularities and the optimal lattice”.

REFERENCES

Dudgeon, D. E. and Mersereau, R. M., 1983. Multidimensional Digital Signal Processing, Prentice Hall.

Blahut, R. E., 1985. Fast Algorithms for Digital Signal Processing, Addison-Wesley Press.

Tutatchikov, V. S., Kiselev, O. I., Noskov, M. V., 2013. “Calculating the n-Dimensional Fast Fourier Transform”, Pattern Recognition and Image Analysis, vol. 23, no. 3, pp. 429-433.

Gonzalez, R. C., Woods, R. E., Eddins, S. L., 2009. Digital Image Processing Using MATLAB, Gatesmark Publishing, Knoxville.

Starovoitov, A. V., 2010. “On multidimensional analog of Cooley-Tukey algorithm”, Reporter Siberian State Aerospace University named after academician M.F.Reshetnev, no. 1 (27), pp. 69-73.

ParallelVersionn-DimensionalFastFourierTransformAlgorithm-AnalogoftheCooley-TukeyAlgorithm

117