CONSIDERATIONS ON THE FFT VARIANTS FOR AN EFFICIENT STREAM IMPLEMENTATION ON GPU
José G. Marichal-Hernández, Fernando Rosa, José M. Rodríguez-Ramos
2006
Abstract
In this article, the variants of the fast Fourier transform algorithm are revisited and analysed in terms of the cost of implementing them on graphics processing units. We describe the key factors in selecting an efficient algorithm that takes advantage of this hardware and, using the stream-model language BrookGPU, we implement efficient versions of the one-dimensional and two-dimensional FFT. These implementations compute one-dimensional transforms of sequences of 262k complex numbers in under 13 ms and two-dimensional transforms of size 1024x1024 in under 59 ms on a G70 GPU, which is almost 3.4 times faster than FFTW on a high-end CPU.
References
- Buck, I. (2004). Brook specification v.0.2. Tech. Rep. CSTR 2003-04 10/31/03 12/5/03, Stanford University.
- Buck, I., Foley, T., Horn, D., Sugerman, J., Fatahalian, K., Houston, M., and Hanrahan, P. (2004). Brook for GPUs: stream computing on graphics hardware. ACM Trans. Graph., 23(3):777-786.
- Cooley, J. W. and Tukey, J. W. (1965). An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19:297-301.
- Dally, W. J., Hanrahan, P., Erez, M., Knight, T. J., et al. (2003). Merrimac: Supercomputing with streams. In SC'03, Phoenix, Arizona.
- Frigo, M. and Johnson, S. (2005). The design and implementation of FFTW3. In Proc. of the IEEE, volume 93, pages 216-231. http://www.fftw.org.
- Jansen, T., von Rymon-Lipinski, B., Hanssen, N., and Keeve, E. (2004). Fourier volume rendering on the GPU using a Split-Stream-FFT. In Proc. of the VMV'04, pages 395-403. IOS Press BV.
- Van Loan, C. (1992). Computational frameworks for the fast Fourier transform. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA.
- Moreland, K. and Angel, E. (2003). The FFT on a GPU. In Proc. of the ACM SIGGRAPH, pages 112-119. Eurographics Association.
- Nussbaumer, H. J. (1982). Fast Fourier Transform and Convolution Algorithms. Springer-Verlag, second edition.
- Pease, M. C. (1968). An adaptation of the fast Fourier transform for parallel processing. J. ACM, 15(2):252-264.
- Püschel, M., Moura, J. M. F., et al. (2005). SPIRAL: Code generation for DSP transforms. Proc. of the IEEE, 93(2).
- Schiwietz, T. and Westermann, R. (2004). GPU-PIV. In Proc. of the VMV'04, pages 151-158. IOS Press BV.
- Stockham, T. (1966). High speed convolution and correlation. In AFIPS Proceedings, volume 28, pages 229-233. Spring Joint Computer Conference.
- Swarztrauber, P. N. (1987). Multiprocessor FFTs. Parallel Computing, 5(1-2):197-210.
- Viola, I., Kanitsar, A., and Gröller, M. E. (2004). GPU-based frequency domain volume rendering. In Proc. of SCCG 2004, pages 49-58.
Paper Citation
in Harvard Style
Marichal-Hernández J. G., Rosa F. and Rodríguez-Ramos J. M. (2006). CONSIDERATIONS ON THE FFT VARIANTS FOR AN EFFICIENT STREAM IMPLEMENTATION ON GPU. In Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, ISBN 972-8865-40-6, pages 80-86. DOI: 10.5220/0001361900800086
in Bibtex Style
@conference{visapp06,
author={José G. Marichal-Hernández and Fernando Rosa and José M. Rodríguez-Ramos},
title={CONSIDERATIONS ON THE FFT VARIANTS FOR AN EFFICIENT STREAM IMPLEMENTATION ON GPU},
booktitle={Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP},
year={2006},
pages={80-86},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001361900800086},
isbn={972-8865-40-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP
TI - CONSIDERATIONS ON THE FFT VARIANTS FOR AN EFFICIENT STREAM IMPLEMENTATION ON GPU
SN - 972-8865-40-6
AU - Marichal-Hernández, J. G.
AU - Rosa, F.
AU - Rodríguez-Ramos, J. M.
PY - 2006
SP - 80
EP - 86
DO - 10.5220/0001361900800086