6.6 Physical Space
The FPGA solution proved very efficient in terms of
board space: the unit measures only 5 cm × 4 cm. The
computational density of the GPU solutions is an order
of magnitude worse, because the AMD 7850 sits on a PCB
of about 26 cm × 12 cm (with two large fans), roughly
16 times the board area of the FPGA unit, and does not
even fit in a standard desktop case.
7 CONCLUSIONS
In this paper, we have compared optimised
implementations of a complex image processing algorithm
on GPU and FPGA. On both target platforms, we
achieved an impressive speed-up factor, albeit with
quite different amounts of effort.
In general, we can conclude that both FPGA and
GPU platforms have important but different advantages.
While the ultimate flexibility of an FPGA-based system
can lead to a speed performance that is an order of
magnitude better than that of the GPU-based platform,
developing an FPGA implementation of a given algorithm
takes considerably more effort than the C/C++ to OpenCL
translation needed for a GPU implementation. Moreover,
we saw that the original algorithm was unsuited to
implementation on an FPGA, even after an approximate
simplification, which forced us to rethink the
algorithm and move to a totally different approach. In
terms of physical space and power consumption, the
FPGA is certainly the winner.
Nevertheless, because the GPU implementation is easy
to integrate into an existing server installation,
eSaturnus chose the GPU solution. At this moment, our
GPU code is running in a hospital in Berlin as a
clinical try-out.
ACKNOWLEDGEMENTS
This work is partially supported by the European
Commission in the CrossRoads project of the Interreg
IVA program Border Region Flanders-Netherlands.
The Battle of the Giants - A Case Study of GPU vs FPGA Optimisation for Real-time Image Processing