XAI: A Middleware for Scalable AI

Abdallah Salama, Alexander Linke, Igor Rocha, Carsten Binnig

2019

Abstract

A major obstacle to the adoption of deep neural networks (DNNs) is that training can take multiple hours or days, even with modern GPUs. To speed up the training of modern DNNs, recent deep learning frameworks support distributing the training process across multiple machines in a cluster of nodes. However, even when well-established models such as AlexNet or GoogLeNet are used, it is still challenging for data scientists to scale out distributed deep learning in their own environments and on their own hardware resources. In this paper, we present XAI, a middleware on top of existing deep learning frameworks such as MXNet and TensorFlow that makes it easy to scale out the distributed training of DNNs. The aim of XAI is to let data scientists use a simple interface to specify the model to be trained and the available resources (e.g., number of machines, number of GPUs per machine). At the core of XAI, we have implemented a distributed optimizer that takes the model and the available cluster resources as input and finds a distributed training setup for the given model that best leverages those resources. Our experiments show that XAI converges to a desired training accuracy 2x to 5x faster than the default distribution setups in MXNet and TensorFlow.


Paper Citation


in Harvard Style

Salama A., Linke A., Rocha I. and Binnig C. (2019). XAI: A Middleware for Scalable AI. In Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-377-3, pages 109-120. DOI: 10.5220/0008120301090120


in Bibtex Style

@conference{data19,
author={Abdallah Salama and Alexander Linke and Igor Rocha and Carsten Binnig},
title={XAI: A Middleware for Scalable AI},
booktitle={Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2019},
pages={109-120},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008120301090120},
isbn={978-989-758-377-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - XAI: A Middleware for Scalable AI
SN - 978-989-758-377-3
AU - Salama A.
AU - Linke A.
AU - Rocha I.
AU - Binnig C.
PY - 2019
SP - 109
EP - 120
DO - 10.5220/0008120301090120