Authors: Abdallah Salama ; Alexander Linke ; Igor Pessoa Rocha and Carsten Binnig

Affiliation: Data Management Lab, TU Darmstadt, Germany

Keyword(s): Distributed Deep Learning, Machine Learning, Cloud Computing, Scalability.

Abstract: A major obstacle to the adoption of deep neural networks (DNNs) is that training can take multiple hours or days even with modern GPUs. To speed up the training of modern DNNs, recent deep learning frameworks support distributing the training process across multiple machines in a cluster of nodes. However, even when well-established models such as AlexNet or GoogLeNet are used, it is still a challenging task for data scientists to scale out distributed deep learning in their environments and on their hardware resources. In this paper, we present XAI, a middleware on top of existing deep learning frameworks such as MXNet and TensorFlow to easily scale out distributed training of DNNs. The aim of XAI is that data scientists can use a simple interface to specify the model that needs to be trained and the resources available (e.g., number of machines, number of GPUs per machine, etc.). At the core of XAI, we have implemented a distributed optimizer that takes the model and the available cluster resources as input and finds a distributed training setup for the given model that best leverages the available resources. Our experiments show that XAI converges to a desired training accuracy 2x to 5x faster than default distribution setups in MXNet and TensorFlow.
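
The abstract describes the optimizer only at a high level, so the following is a minimal illustrative sketch of the general idea (resources in, distributed setup out), not the method from the paper. Everything in it is an assumption made for illustration: the names `ClusterResources`, `DistributedSetup`, `candidate_setups`, `estimated_throughput`, and `choose_setup` are hypothetical, the setup is reduced to a worker count and a parameter-server count, and the cost model is a toy that caps throughput at the slower of aggregate worker compute and aggregate parameter-server capacity.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ClusterResources:
    """Hypothetical description of the hardware a user makes available."""
    machines: int
    gpus_per_machine: int


@dataclass(frozen=True)
class DistributedSetup:
    """Hypothetical training setup: worker and parameter-server counts."""
    workers: int
    parameter_servers: int


def candidate_setups(resources):
    """Enumerate setups that fit: at most one worker per GPU and
    at most one parameter server per machine (both are assumptions)."""
    max_workers = resources.machines * resources.gpus_per_machine
    for workers, servers in product(range(1, max_workers + 1),
                                    range(1, resources.machines + 1)):
        yield DistributedSetup(workers, servers)


def estimated_throughput(setup, compute_rate, server_rate):
    """Toy cost model (assumption, not from the paper): throughput is
    bounded by the slower of total worker compute and total server capacity."""
    return min(setup.workers * compute_rate,
               setup.parameter_servers * server_rate)


def choose_setup(resources, compute_rate, server_rate):
    """Return the highest-throughput setup, preferring fewer nodes on ties."""
    return max(
        candidate_setups(resources),
        key=lambda s: (estimated_throughput(s, compute_rate, server_rate),
                       -(s.workers + s.parameter_servers)),
    )
```

With two machines of two GPUs each, a per-worker rate of 100 and a per-server rate of 150 (arbitrary units), this toy search settles on three workers and two parameter servers: a fourth worker would add no throughput because the servers are already saturated.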

CC BY-NC-ND 4.0

Paper citation in several formats:
Salama, A.; Linke, A.; Rocha, I. and Binnig, C. (2019). XAI: A Middleware for Scalable AI. In Proceedings of the 8th International Conference on Data Science, Technology and Applications - DATA; ISBN 978-989-758-377-3; ISSN 2184-285X, SciTePress, pages 109-120. DOI: 10.5220/0008120301090120

@conference{data19,
author={Abdallah Salama and Alexander Linke and Igor Pessoa Rocha and Carsten Binnig},
title={XAI: A Middleware for Scalable AI},
booktitle={Proceedings of the 8th International Conference on Data Science, Technology and Applications - DATA},
year={2019},
pages={109-120},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008120301090120},
isbn={978-989-758-377-3},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of the 8th International Conference on Data Science, Technology and Applications - DATA
TI - XAI: A Middleware for Scalable AI
SN - 978-989-758-377-3
IS - 2184-285X
AU - Salama, A.
AU - Linke, A.
AU - Rocha, I.
AU - Binnig, C.
PY - 2019
SP - 109
EP - 120
DO - 10.5220/0008120301090120
PB - SciTePress