This work focuses in particular on one of the core
components of the architecture: the model training
module. To select a technology for this module,
we first analyzed the characteristics of five open-
source AutoML tools (Auto-Keras, Auto-Sklearn,
Auto-Weka, H2O AutoML and TransmogrifAI).
Then, we performed a benchmark experimental study
with the two tools that presented a distributed ML
capability: H2O AutoML and TransmogrifAI. The
experiments were conducted using three real-world
datasets provided by the software company (churn,
event forecasting and fraud detection). The obtained
results allowed us to evaluate the potential of both
AutoML technologies for the model training module of
the proposed architecture.
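The comparison procedure summarized above can be illustrated with a minimal sketch, assuming each tool is wrapped in a runner function that returns one score per dataset, where higher is better. The names `Runner`, `benchmark`, `best_tool`, `tool_a` and `tool_b` are hypothetical, the selection rule (highest mean score) is an assumption rather than the criterion used in this work, and the stub scores are placeholders, not results from the study.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical signature: a runner trains one AutoML tool on a named dataset
# and returns a single score where higher is better.
Runner = Callable[[str], float]


def benchmark(tools: Dict[str, Runner], datasets: List[str]) -> Dict[str, Dict[str, float]]:
    """Run every tool on every dataset and collect the scores per tool."""
    return {name: {ds: run(ds) for ds in datasets} for name, run in tools.items()}


def best_tool(scores: Dict[str, Dict[str, float]]) -> str:
    """Return the tool with the highest mean score across datasets (assumed rule)."""
    return max(scores, key=lambda name: mean(scores[name].values()))


if __name__ == "__main__":
    # Stub runners with placeholder scores; these are NOT results from the study.
    stub_tools = {
        "tool_a": lambda ds: 0.5,
        "tool_b": lambda ds: 0.75,
    }
    datasets = ["churn", "event forecasting", "fraud detection"]
    scores = benchmark(stub_tools, datasets)
    print(best_tool(scores))
```

In practice each stub would be replaced by a call into the corresponding AutoML tool, and other criteria (e.g., training time) could be added to the selection rule.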
Overall, the proposed framework received positive
feedback from the software company, which opted to
select the H2O AutoML tool for its model training
module. In future work, additional telecommunications
datasets will be addressed, in order to further
benchmark the AutoML tools. We also wish to extend
the ML capabilities of the framework to handle more
ML tasks (e.g., ordinal classification, multi-target
regression). Moreover, we intend to focus development
on the remaining components of the architecture, in
order to select the best technologies to be used
(e.g., for handling missing data).
This work was executed under the project IRMDA -
Intelligent Risk Management for the Digital Age,
Individual Project, NUP: POCI-01-0247-FEDER-038526,
co-funded by the Incentive System for Research and
Technological Development, from the Thematic
Operational Program Competitiveness of the national
framework program - Portugal2020.
An Automated and Distributed Machine Learning Framework for Telecommunications Risk Management