REFERENCES
Avnet (2019). Ultra96 - 96boards. https://www.96boards.
org/product/ultra96/. (Accessed on 04/29/2019).
Black, D. C., Donovan, J., Bunton, B., and Keist, A. (2009).
SystemC: From the ground up, volume 71. Springer
Science & Business Media.
Chetlur, S., Woolley, C., Vandermersch, P., Cohen, J., Tran,
J., Catanzaro, B., and Shelhamer, E. (2014). cudnn:
Efficient primitives for deep learning. arXiv preprint
arXiv:1410.0759.
Del Sozzo, E., Di Tucci, L., and Santambrogio, M. D.
(2017). A highly scalable and efficient parallel design
of n-body simulation on fpga. In 2017 IEEE Interna-
tional Parallel and DistributedProcessing Symposium
Workshops (IPDPSW), pages 241–246. IEEE.
El Adawy, M., Kamaleldin, A., Mostafa, H., and Said, S.
(2017). Performance evaluation of turbo encoder im-
plementation on a heterogeneous fpga-cpu platform
using sdsoc. In 2017 Intl Conf on Advanced Control
Circuits Systems (ACCS) Systems & 2017 Intl Conf on
New Paradigms in Electronics & Information Technol-
ogy (PEIT), pages 286–290. IEEE.
Gajski, D. D., Dutt, N. D., Wu, A. C., and Lin, S. Y. (2012).
High—Level Synthesis: Introduction to Chip and Sys-
tem Design. Springer Science & Business Media.
Gomes, T., Pinto, S., Tavares, A., and Cabral, J. (2015).
Towards an fpga-based edge device for the internet of
things. In 2015 IEEE 20th Conference on Emerging
Technologies & Factory Automation (ETFA), pages 1–
4. IEEE.
Kathail, V., Hwang, J., Sun, W., Chobe, Y., Shui, T., and
Carrillo, J. (2016). Sdsoc: A higher-level program-
ming environment for zynq soc and ultrascale+ mp-
soc. In Proceedings of the 2016 ACM/SIGDA Interna-
tional Symposium on Field-Programmable Gate Ar-
rays, pages 4–4. ACM.
K&F (2015). K & f computing research. http://www.kfcr.jp/
grape9.html. (Accessed on 04/29/2019).
Khronos (2019). Opencl overview - the khronos group
inc. https://www.khronos.org/opencl/. (Accessed on
04/29/2019).
Kowalczyk, M., Przewlocka, D., and Krvjak, T. (2018).
Real-time implementation of contextual image pro-
cessing operations for 4k video stream in zynq ultra-
scale+ mpsoc. In 2018 Conference on Design and Ar-
chitectures for Signal and Image Processing (DASIP),
pages 37–42. IEEE.
Luebke, D., Harris, M., Govindaraju, N., Lefohn, A., Hous-
ton, M., Owens, J., Segal, M., Papakipos, M., and
Buck, I. (2006). Gpgpu: general-purpose computa-
tion on graphics hardware. In Proceedings of the 2006
ACM/IEEE Conference on Supercomputing, page 208.
ACM.
Luo, L., Wu, Y., Qiao, F., Yang, Y., Wei, Q., Zhou, X.,
Fan, Y., Xu, S., Liu, X., and Yang, H. (2018). Design
of FPGA-Based Accelerator for Convolutional Neu-
ral Network under Heterogeneous Computing Frame-
work with OpenCL. International Journal of Recon-
figurable Computing.
Mousouliotis, P. G. and Petrou, L. P. (2019). Software-
defined FPGA accelerator design for mobile deep
learning applications. CoRR, abs/1902.03192.
Peng, B., Wang, T., Jin, X., and Wang, C. (2016). An Accel-
erating Solution for N-Body MOND Simulation with
FPGA-SoC. International Journal of Reconfigurable
Computing.
Rettkowski, J., Boutros, A., and G¨ohringer, D. (2017).
Hw/sw co-design of the hog algorithm on a xilinx
zynq soc. Journal of Parallel and Distributed Com-
puting, 109:50–62.
Roh, S.-D., Cho, K., and Chung, K.-S. (2016). Implemen-
tation of an ldpc decoder on a heterogeneous fpga-cpu
platform using sdsoc. In 2016 IEEE Region 10 Con-
ference (TENCON), pages 2555–2558. IEEE.
Sato, T. and Narumi, T. (2015). Acceleration of othello
computer game using an fpga tablet. In 2015 Third
International Symposium on Computing and Network-
ing (CANDAR), pages 581–584. IEEE.
Shi, W. and Dustdar, S. (2016). The promise of edge com-
puting. Computer, 49(5):78–81.
Ukawa, H. and Narumi, T. (2015). Acceleration of the Fast
Multipole Method on FPGA Devices. IEICE Transac-
tions on Information and Systems, E98D(2):309–312.
Waidyasooriya, H. M., Endo, T., Hariyama, M., and Ohtera,
Y. (2017). Opencl-based fpga accelerator for 3d
fdtd with periodic and absorbing boundary conditions.
International Journal of Reconfigurable Computing,
2017.
Xilinx (2019a). Sdsoc development environment. https://
www.xilinx.com/products/design-tools/software-
zone/sdsoc.html. (Accessed on 04/10/2019).
Xilinx (2019b). Vivado high-level synthesis.
https://www.xilinx.com/products/design-tools/
vivado/integration/esl-design.html. (Accessed on
04/29/2019).
Zeng, S., Guo, K., Fang, S., Kang, J., Xie, D., Shan, Y.,
Wang, Y., and Yang, H. (2018). An efficient reconfig-
urable framework for general purpose cnn-rnn models
on fpgas. pages 1–5.