rithms and the results will be compared. So the pur-
pose of this article is to describe a platform that allows
a user to select, tune, and pipeline multiple AI algorithms, as well as preprocessing and visualization tasks.
All these operations may be done without writing any
code and without programming skills. This tool is
very useful in the early stage of a project when the
feasibility of a given approach must be assessed. The tool will give an estimate of the quality level that may be obtained using different AI algorithms. This will reduce the time to production and increase the chance of a successful implementation. Furthermore, we will introduce a step-by-step process for choosing the right sequence of algorithms.
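Conceptually, the pipelines the platform assembles visually correspond to what a programmer would otherwise write by hand. The following sketch (illustrative only; it uses scikit-learn, which is our choice for illustration and not part of the platform) chains a preprocessing step with an AI algorithm and produces the kind of quality estimate discussed above:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

# A preprocessing task chained with an AI algorithm, then evaluated --
# the kind of feasibility estimate the platform automates without code.
pipe = Pipeline([("scale", StandardScaler()), ("clf", SVC(C=1.0))])

X, y = load_iris(return_X_y=True)
scores = cross_val_score(pipe, X, y, cv=5)
print(round(scores.mean(), 2))
```

The platform described in this article lets a user build and swap such stages interactively, so different algorithms can be compared without rewriting this scaffolding for each candidate.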
The rest of the paper is organized as follows. Section 2 gives a brief description of related work. Section 3 contains the theoretical classification of different algorithms and the decision tree which can be used to choose the proper algorithm for a given problem.
The architecture of the platform is described in Sec-
tion 4. Experiments can be found in Section 5 and
Section 6 concludes the article.
2 RELATED WORK
Choosing the best algorithm for a specific problem
domain, taking into consideration multiple factors
(e.g. the type of the data involved, the nature of the problem, etc.) is a general problem in developing intelligent products. There are multiple articles that try to classify AI algorithms from different points of view.
In (Dasgupta and Nath, 2016) we can find a typical classification of machine learning algorithms, based on the nature of the training data, as follows:
- supervised, if the training data is labeled;
- unsupervised, if there are no labels;
- semi-supervised, if some of the class labels are
missing.
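The three settings differ only in which labels are available to the learner. A minimal sketch (using scikit-learn, our choice for illustration; the classes shown are one representative per category, not the paper's recommendations):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.semi_supervised import LabelPropagation

X = np.array([[0.0], [0.1], [0.9], [1.0]])
y = np.array([0, 0, 1, 1])

# Supervised: every training sample carries a label.
clf = LogisticRegression().fit(X, y)

# Unsupervised: no labels at all; structure is inferred from X alone.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Semi-supervised: some labels are missing (marked -1 by convention).
y_partial = np.array([0, -1, -1, 1])
ssl = LabelPropagation().fit(X, y_partial)

print(clf.predict([[0.05]]), ssl.predict([[0.95]]))
```

Note that the same dataset can be attacked with any of the three families simply by discarding or hiding labels, which is precisely why the nature of the training data alone is a weak criterion for choosing an algorithm.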
The problem with this classification is that it considers only the nature of the training data, without taking the context of the problem into consideration.
This approach reduces the searching space, but each
class has too many algorithms to be considered, tested
and evaluated. Other articles describe specific problems, but only in the context of classification (Ilias et al., 2007), regression (Gulden and Nese, 2013), or a single specific algorithm. There are lots
of articles focusing on a specific problem and only
one specific algorithm to resolve that problem. For
example, in the case of anomaly detection, there are
plenty of articles discussing different scenarios and
comparing results of different algorithms.
In (Zareapoor et al., 2012) the authors make a
comprehensive evaluation of the most popular algo-
rithms used in the context of credit card fraud detec-
tion. This article contains a brief description of algorithms like Bayesian Networks, Neural Networks, SVM, etc., and at the end of the article there is a table comparing the results using metrics like accuracy, speed, and cost. Other types of articles have a different approach, creating a survey of all the algorithms
which can be used for a specific problem.
In (Varun et al., 2009) there is a survey of all the
algorithms which can be used to effectively detect
anomalies. This article does not narrow its view to only supervised or only unsupervised algorithms. Instead, the authors describe an extensive list of algorithms which can be used in the larger context of anomaly detection. There are plenty of research papers comparing two or more different algorithms, as
in (Juan et al., 2004) (Murad et al., 2013) (Agrawal
and Agrawal, 2016) (Gupta et al., 2014).
The papers mentioned above try to help engineers find the best algorithm for specific problems, but none of them provides a step-by-step process, a reusable methodology for choosing the right algorithm for any type of problem and any type of dataset. In this paper we propose such a step-by-step methodology. Moreover, we present our platform, which can be used to evaluate the chosen algorithms.
3 THE PROPOSED
METHODOLOGY FOR
CHOOSING AI ALGORITHMS
There are no universally good or bad algorithms; each algorithm is specific to a context, a problem, or a type of dataset. This idea was demonstrated by David H. Wolpert and William G. Macready in (Wolpert and Macready, 1997), in the so-called "No Free Lunch Theorems for Optimization". The NFLT is a set of mathematical proofs and a general framework that explores the connection between general-purpose, "black-box" algorithms and the problems they solve. It states that no algorithm that searches for an optimal cost or fitness solution is universally superior to any other algorithm. Wolpert and Macready wrote in their paper that "If an algorithm performs better than random search on some class of problems then it must perform worse than random search on the remaining problems."
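A sketch of the first theorem in Wolpert and Macready's notation, where $d_m^y$ denotes the sequence of $m$ cost values an algorithm produces on an objective function $f$: for any pair of algorithms $a_1$ and $a_2$,

```latex
\sum_{f} P\left(d_m^y \mid f, m, a_1\right) = \sum_{f} P\left(d_m^y \mid f, m, a_2\right)
```

Averaged over all possible objective functions $f$, every search algorithm yields the same distribution of outcomes, which is exactly why superior performance on one class of problems must be paid for on the complementary class.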
In the real world, we need to decide on engineer-
IJCCI 2018 - 10th International Joint Conference on Computational Intelligence