art in artificial neural network applications: A survey.
Heliyon, 4(11):e00938.
Amodei, D., Ananthanarayanan, S., Anubhai, R., Bai, J.,
Battenberg, E., Case, C., Casper, J., Catanzaro, B.,
Cheng, Q., Chen, G., et al. (2016). Deep speech 2:
End-to-end speech recognition in English and Mandarin.
In International conference on machine learning,
pages 173–182. PMLR.
Ghosh, R., Ghosh, K., and Maitra, S. (2017). Automatic
detection and classification of diabetic retinopathy
stages using CNN. In 2017 4th International Conference
on Signal Processing and Integrated Networks
(SPIN), pages 550–554. IEEE.
Google (2023a). Cloud TPU system architecture. [Accessed
Jan. 23, 2023].
Google (2023b). XLA: Google’s accelerated linear algebra
library. [Accessed Jan. 26, 2023].
Hatcher, W. G. and Yu, W. (2018). A survey of deep learning:
Platforms, applications and emerging research
trends. IEEE Access, 6:24411–24432.
Hestness, J., Ardalani, N., and Diamos, G. (2019). Beyond
human-level accuracy: Computational challenges in
deep learning. In Proceedings of the 24th Symposium
on Principles and Practice of Parallel Programming,
pages 1–14.
Janghorbani, M., Jones, R. B., and Allison, S. P. (2000).
Incidence of and risk factors for proliferative retinopathy
and its association with blindness among diabetes
clinic attenders. Ophthalmic Epidemiology, 7(4):225–241.
Jouppi, N. P., Young, C., Patil, N., Patterson, D., Agrawal,
G., Bajwa, R., Bates, S., Bhatia, S., Boden, N.,
Borchers, A., et al. (2017). In-datacenter performance
analysis of a tensor processing unit. In Proceedings of
the 44th annual international symposium on computer
architecture, pages 1–12.
Keskar, N. S., Mudigere, D., Nocedal, J., Smelyanskiy, M.,
and Tang, P. T. P. (2016). On large-batch training for
deep learning: Generalization gap and sharp minima.
arXiv preprint arXiv:1609.04836.
Kim, T. K. (2015). T test as a parametric statistic. Korean
journal of anesthesiology, 68(6):540.
Künas, C. A., Serpa, M. S., Bez, J. L., Padoin, E. L., and
Navaux, P. O. (2021). Offloading the training of an I/O
access pattern detector to the cloud. In 2021 International
Symposium on Computer Architecture and High
Performance Computing Workshops (SBAC-PADW),
pages 15–19. IEEE.
Lin, G.-M., Chen, M.-J., Yeh, C.-H., Lin, Y.-Y., Kuo, H.-Y.,
Lin, M.-H., Chen, M.-C., Lin, S. D., Gao, Y., Ran, A.,
et al. (2018). Transforming retinal photographs to entropy
images in deep learning to improve automated
detection for diabetic retinopathy. Journal of ophthalmology,
2018.
Mohammadian, S., Karsaz, A., and Roshan, Y. M. (2017).
Comparative study of fine-tuning of pre-trained convolutional
neural networks for diabetic retinopathy
screening. In 2017 24th National and 2nd International
Iranian Conference on Biomedical Engineering
(ICBME), pages 1–6. IEEE.
Network, S. I. G. (2010). Management of obesity: a
national clinical guideline. Scottish Intercollegiate
Guidelines Network: Edinburgh, 20.
Roloff, E. (2013). Viability and performance of high-
performance computing in the cloud. Master’s thesis,
Federal University of Rio Grande do Sul.
Stitt, A. W., Curtis, T. M., Chen, M., Medina, R. J., McKay,
G. J., Jenkins, A., Gardiner, T. A., Lyons, T. J.,
Hammes, H.-P., Simo, R., et al. (2016). The progress
in understanding and treatment of diabetic retinopathy.
Progress in retinal and eye research, 51:156–186.
Sun, C., Shrivastava, A., Singh, S., and Gupta, A. (2017).
Revisiting unreasonable effectiveness of data in deep
learning era. In Proceedings of the IEEE international
conference on computer vision, pages 843–852.
Voets, M., Møllersen, K., and Bongo, L. A. (2019).
Reproduction study using public data of: Development and
validation of a deep learning algorithm for detection
of diabetic retinopathy in retinal fundus photographs.
PloS one, 14(6):e0217541.
Wang, Y. E., Wei, G.-Y., and Brooks, D. (2019a).
Benchmarking TPU, GPU, and CPU platforms for deep learning.
arXiv preprint arXiv:1907.10701.
Wang, Z., Liu, K., Li, J., Zhu, Y., and Zhang, Y. (2019b).
Various frameworks and libraries of machine learning
and deep learning: a survey. Archives of computational
methods in engineering, pages 1–24.
Wilkinson, C., Ferris III, F. L., Klein, R. E., Lee, P. P.,
Agardh, C. D., Davis, M., Dills, D., Kampik, A.,
Pararajasegaram, R., Verdaguer, J. T., et al. (2003).
Proposed international clinical diabetic retinopathy
and diabetic macular edema disease severity scales.
Ophthalmology, 110(9):1677–1682.
Wongpanich, A., Pham, H., Demmel, J., Tan, M., Le, Q.,
You, Y., and Kumar, S. (2021). Training EfficientNets
at supercomputer scale: 83% ImageNet top-1 accuracy
in one hour. In 2021 IEEE International Parallel
and Distributed Processing Symposium Workshops
(IPDPSW), pages 947–950. IEEE.
Ying, C., Kumar, S., Chen, D., Wang, T., and Cheng, Y.
(2018). Image classification at supercomputer scale.
arXiv preprint arXiv:1811.06992.
You, Y., Li, J., Reddi, S., Hseu, J., Kumar, S., Bhojanapalli,
S., Song, X., Demmel, J., Keutzer, K., and Hsieh, C.-J.
(2019a). Large batch optimization for deep learning:
Training BERT in 76 minutes. arXiv preprint
arXiv:1904.00962.
You, Y., Zhang, Z., Hsieh, C.-J., Demmel, J., and Keutzer,
K. (2018). ImageNet training in minutes. In Proceedings
of the 47th International Conference on Parallel
Processing, pages 1–10.
You, Y., Zhang, Z., Hsieh, C.-J., Demmel, J., and Keutzer,
K. (2019b). Fast deep neural network training on distributed
systems and cloud TPUs. IEEE Transactions
on Parallel and Distributed Systems, 30(11):2449–2462.
Zheng, Y., Ley, S. H., and Hu, F. B. (2018). Global aetiology
and epidemiology of type 2 diabetes mellitus
and its complications. Nature reviews endocrinology,
14(2):88–98.
Accelerating Deep Learning Model Training on Cloud Tensor Processing Unit