Authors: Javier Jorge 1 ; Germán Moltó 1 ; Damian Segrelles 1 ; João Pedro Fontes 2 and Miguel Angel Guevara 2

Affiliations: 1 Instituto de Instrumentación para Imagen Molecular (I3M), Centro Mixto CSIC, Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain ; 2 Computer Graphics Center, University of Minho, Campus de Azurém, Guimarães, Portugal

Keyword(s): Cloud Computing, Deep Learning, Multi-cloud.

Abstract: This paper introduces a platform based on open-source tools to automatically deploy and provision a distributed set of nodes that conduct the training of a deep learning model. To this end, it uses the deep learning framework TensorFlow together with the Infrastructure Manager service, which deploys complex infrastructures programmatically. The provisioned infrastructure covers data handling, model training on these data, and persistence of the trained model. For this purpose, public Cloud platforms such as Amazon Web Services (AWS), together with General-Purpose Computing on Graphics Processing Units (GPGPU), are employed to dynamically and efficiently perform the workflow of tasks related to training deep learning models. This approach has been applied to real-world use cases to compare local training versus distributed training on the Cloud. The results indicate that the dynamic provisioning of GPU-enabled distributed virtual clusters in the Cloud introduces great flexibility to cost-effectively train deep learning models.

CC BY-NC-ND 4.0

Paper citation in several formats:
Jorge, J.; Moltó, G.; Segrelles, D.; Fontes, J. and Guevara, M. (2021). Deployment Service for Scalable Distributed Deep Learning Training on Multiple Clouds. In Proceedings of the 11th International Conference on Cloud Computing and Services Science - CLOSER; ISBN 978-989-758-510-4; ISSN 2184-5042, SciTePress, pages 135-142. DOI: 10.5220/0010359601350142

@conference{closer21,
author={Javier Jorge and Germán Moltó and Damian Segrelles and João Pedro Fontes and Miguel Angel Guevara},
title={Deployment Service for Scalable Distributed Deep Learning Training on Multiple Clouds},
booktitle={Proceedings of the 11th International Conference on Cloud Computing and Services Science - CLOSER},
year={2021},
pages={135-142},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010359601350142},
isbn={978-989-758-510-4},
issn={2184-5042},
}

TY - CONF

JO - Proceedings of the 11th International Conference on Cloud Computing and Services Science - CLOSER
TI - Deployment Service for Scalable Distributed Deep Learning Training on Multiple Clouds
SN - 978-989-758-510-4
IS - 2184-5042
AU - Jorge, J.
AU - Moltó, G.
AU - Segrelles, D.
AU - Fontes, J.
AU - Guevara, M.
PY - 2021
SP - 135
EP - 142
DO - 10.5220/0010359601350142
PB - SciTePress