three possible sets: training set, validation set and
test set. This feature allows the analyst to define
the sets that will be used by the ensemble (meta)
method. The base model methods use the same
sets that were defined for the ensemble. The
DMEE does this to avoid bias in favor of either the
base model methods or the ensemble method.
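The division into shared sets can be sketched as follows. This is a minimal illustration, not the DMEE implementation: the function name `split_series`, the fraction parameters and the chronological (unshuffled) split are all assumptions made for the example.

```python
from typing import List, NamedTuple, Sequence


class SeriesSplit(NamedTuple):
    """The three sets shared by the base model methods and the ensemble."""
    training: List[float]
    validation: List[float]
    test: List[float]


def split_series(values: Sequence[float],
                 train_frac: float = 0.6,
                 val_frac: float = 0.2) -> SeriesSplit:
    """Split a series chronologically (assumption: no shuffling)."""
    if not (0 < train_frac < 1 and 0 <= val_frac < 1
            and train_frac + val_frac <= 1):
        raise ValueError("fractions must be positive and sum to at most 1")
    n = len(values)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return SeriesSplit(list(values[:train_end]),
                       list(values[train_end:val_end]),
                       list(values[val_end:]))


# Hypothetical series of ten observations.
series = [float(t) for t in range(10)]
split = split_series(series)
# The same `split` object would be handed to every base model method
# and to the ensemble method, so that all of them see identical sets.
```

Passing one `split` object to every method is what keeps the comparison between the base models and the ensemble fair.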
Figure 1: The DMEE general architecture.
2.2 Base Model Method Execution and Selection
After selecting the time series and dividing its
data into the appropriate sets for the ensemble, the
analyst should configure and execute the base model
methods chosen from the repository to be used in the
ensemble. The DMEE core functionality executes
the base model methods one at a time and
stores the prediction results in a database. These
results will be used as input for the ensemble
combination method.
Because the DMEE stores each base model method
result, that result can be reused at any later time.
Executing a method is very time consuming and, in
the current version of the DMEE, methods are not
executed concurrently. Storing the results saves
time by allowing the result of a particular method
execution (i.e. a specific base model method
configuration) to be reused rather than recomputed.
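The reuse of stored results can be sketched as a cache keyed by method name and configuration. This is an illustrative sketch only: `ResultStore` is a hypothetical in-memory stand-in for the DMEE database, and `moving_average` is a toy base model introduced for the example.

```python
import json
from typing import Callable, Dict, List, Sequence


class ResultStore:
    """Hypothetical in-memory stand-in for the DMEE result database."""

    def __init__(self) -> None:
        self._results: Dict[str, List[float]] = {}
        self.executions = 0  # number of real (non-reused) executions

    @staticmethod
    def _key(method_name: str, config: dict) -> str:
        # A result is identified by the method and its exact configuration.
        # Assumption: one store per time series, so the series is not keyed.
        return json.dumps({"method": method_name, "config": config},
                          sort_keys=True)

    def run(self, method_name: str,
            method: Callable[..., Sequence[float]],
            config: dict, series: Sequence[float]) -> List[float]:
        key = self._key(method_name, config)
        if key not in self._results:
            # Methods are executed one at a time, only when no result exists.
            self._results[key] = list(method(series, **config))
            self.executions += 1
        return self._results[key]


def moving_average(series: Sequence[float], window: int) -> List[float]:
    """Toy base model: predict each point as the mean of the previous
    `window` observations."""
    return [sum(series[t - window:t]) / window
            for t in range(window, len(series))]


store = ResultStore()
series = [1.0, 2.0, 3.0, 4.0, 5.0]
first = store.run("moving_average", moving_average, {"window": 2}, series)
again = store.run("moving_average", moving_average, {"window": 2}, series)
other = store.run("moving_average", moving_average, {"window": 3}, series)
```

The second call with `{"window": 2}` returns the stored result without executing the method again, while the different configuration triggers a new execution.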
After executing the base model methods, the
analyst can verify their performance (based on
whichever criteria she chooses). To improve
the ensemble results, she can select which base
model methods will make up the ensemble. The DMEE
provides five approaches for base model method
selection: (i) all methods: the results from all the
base model methods executed will be used in the
ensemble combination method; (ii) individual
selection: the analyst can choose which base model
method results will be used in the ensemble
combination method; (iii) selection by maximum
index: the analyst indicates a maximum error value
for the base model method results. The DMEE will
select those base model methods that present a
prediction error smaller than the value provided by
the analyst; (iv) selection by percentage index: this
approach is similar to approach (iii), except
that the metric used to evaluate the base model
method performance is a percentage error rather
than an absolute error; (v) automatic
selection (based on simple averaging combination):
this approach allows the DMEE to automatically
select the set of base model method results to be
used in the ensemble by executing a series of
simulations to evaluate the combination of the
obtained results.
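Approaches (iii), (iv) and (v) can be sketched as follows. This is a hedged illustration rather than the DMEE algorithm: the use of MAE and MAPE as the absolute and percentage metrics, the exhaustive search over subsets in `automatic_selection`, and all function names are assumptions made for the example.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Metric = Callable[[Sequence[float], Sequence[float]], float]


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error (assumed absolute metric)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error (assumed percentage metric)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def select_by_max_error(results: Dict[str, Sequence[float]],
                        actual: Sequence[float], max_error: float,
                        metric: Metric = mae) -> List[str]:
    """Approaches (iii)/(iv): keep methods whose error is below the bound."""
    return sorted(name for name, pred in results.items()
                  if metric(actual, pred) < max_error)


def simple_average(predictions: Sequence[Sequence[float]]) -> List[float]:
    return [sum(col) / len(col) for col in zip(*predictions)]


def automatic_selection(results: Dict[str, Sequence[float]],
                        actual: Sequence[float],
                        metric: Metric = mae) -> Tuple[Tuple[str, ...], float]:
    """Approach (v), assuming every non-empty subset is simulated and the
    one whose simple average has the lowest error is kept."""
    best_subset: Tuple[str, ...] = ()
    best_error = float("inf")
    names = sorted(results)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            error = metric(actual,
                           simple_average([results[n] for n in subset]))
            if error < best_error:
                best_subset, best_error = subset, error
    return best_subset, best_error


# Hypothetical stored results of three base model methods.
actual = [10.0, 20.0, 30.0]
results = {
    "a": [12.0, 22.0, 32.0],  # overshoots by 2
    "b": [8.0, 18.0, 28.0],   # undershoots by 2
    "c": [15.0, 25.0, 35.0],  # overshoots by 5
}
by_absolute = select_by_max_error(results, actual, max_error=3.0)
by_percentage = select_by_max_error(results, actual, max_error=20.0,
                                    metric=mape)
best_subset, best_error = automatic_selection(results, actual)
```

In this toy data the errors of methods `a` and `b` cancel under simple averaging, so the automatic search prefers their pair over any single method. An exhaustive search grows exponentially with the number of methods, so it is only practical for small repositories.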
2.3 Combination Method Selection and Execution
After selecting which base model methods will be
used in the ensemble, the analyst should select the
combination method. The DMEE provides two types
of combination methods: linear and non-linear. The
available linear combination methods are simple
averaging and weighted averaging. If the analyst
chooses to use non-linear combination, she can use
any method (from the DMEE repository) to combine
the base model method results. In the DMEE, the
analyst can configure two kinds of strategy for the
combination method: a combination strategy and a
training strategy. This lets the analyst try several
variants of the same ensemble method.
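The two linear combination methods can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the weights are supplied by the caller and normalized by their sum, and simple averaging is expressed as the equal-weights special case.

```python
from typing import List, Sequence


def weighted_averaging(predictions: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Combine base model predictions with one weight per base model.
    Assumption: weights are normalized by their sum."""
    if len(weights) != len(predictions):
        raise ValueError("one weight per base model method is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [sum(w * p for w, p in zip(weights, column)) / total
            for column in zip(*predictions)]


def simple_averaging(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Simple averaging is weighted averaging with equal weights."""
    return weighted_averaging(predictions, [1.0] * len(predictions))


# Hypothetical predictions of two base model methods for two time steps.
base_predictions = [[10.0, 20.0], [14.0, 24.0]]
simple = simple_averaging(base_predictions)
weighted = weighted_averaging(base_predictions, [3.0, 1.0])
```

With weights 3 and 1 the combined forecast lies three times closer to the first base model than to the second.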
The combination strategies are: (i) simple
combination: only the results of the base model
methods are used as input for the combination
method; (ii) compound combination: besides the
results of the base model methods, the original time
series is used as input for the combination method.
The training strategies are: (i) single-phase training:
the combination method uses only the training set;
(ii) two-phase training: the combination method uses
both the training and the validation sets.
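The strategies above determine what data reaches the combination method, which can be sketched as follows. The names `build_rows` and `select_phases` are hypothetical, and the choice of which original series values accompany each row (here, the preceding observation) is an assumption; the text does not specify it.

```python
from typing import List, Sequence, Tuple

Row = List[float]


def build_rows(base_predictions: Sequence[Sequence[float]],
               series_values: Sequence[float],
               combination: str) -> List[Row]:
    """Build one input row per time index for the combination method."""
    rows = [list(column) for column in zip(*base_predictions)]
    if combination == "simple":
        return rows  # base model results only
    if combination == "compound":
        if len(series_values) != len(rows):
            raise ValueError("series values must align with the predictions")
        # Base model results plus a value from the original series.
        return [row + [x] for row, x in zip(rows, series_values)]
    raise ValueError(f"unknown combination strategy: {combination}")


def select_phases(training_rows: Sequence[Row],
                  validation_rows: Sequence[Row],
                  training: str) -> Tuple[List[Row], List[Row]]:
    """Return the (training, validation) rows the combination method sees."""
    if training == "single_phase":
        return list(training_rows), []
    if training == "two_phase":
        return list(training_rows), list(validation_rows)
    raise ValueError(f"unknown training strategy: {training}")


# Hypothetical results of two base model methods for three time indices.
base_predictions = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5]]
lagged_series = [0.9, 2.1, 2.9]  # assumed: observation preceding each index

simple_rows = build_rows(base_predictions, lagged_series, "simple")
compound_rows = build_rows(base_predictions, lagged_series, "compound")
fit_rows, val_rows = select_phases(compound_rows[:2], compound_rows[2:],
                                   "two_phase")
single_fit, single_val = select_phases(compound_rows[:2], compound_rows[2:],
                                       "single_phase")
```

Because the two strategies are independent, the analyst can run the same combination method under four configurations.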
The DMEE executes the combination method
and stores its results for analysis purposes.
Executing the combination method requires
particular care when the base model methods use
the prediction window concept (e.g. regression
methods). The prediction windows of the base
model methods must be aligned in order to determine
the initial time index to be used in the training phase
of the combination method. If the combination method