proposed SIMO method could improve accuracy and f-measure performance, through synthetic instance optimisation, over standalone SMAC optimisation in empirical experiments spanning a 24-hour range of temporal parameter settings. The expectation for the proposed SIMO approach is that a performance improvement should be evident with each synthetic instance increment when compared with the optimised collective sample performance. This should hold for both the accuracy and f-measure metrics in the main, with some potential individual exceptions identified for further analysis, as is the case in Table 2 for hour 3. Figures 5 and 6 present a selection of these results, affirming the initial prediction in terms of both accuracy and f-measure. This validation establishes SIMO as an effective performance-enhancing model, suitable for use when applying optimisation approaches to a sparse training sample. There are, however, some anomalous results, with a number of poor optimisation performances observed over longer experiment durations compared with shorter ones. These results require further investigation to determine why performance was so poor under certain parameter settings, and to identify the optimum classifiers from the poorest-performing years for future consideration.
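As a concrete illustration of this per-increment check, the following minimal sketch (in Python, with hypothetical score dictionaries and a hypothetical tolerance parameter, not the study's data) flags the hourly intervals in which a SIMO increment fails to improve on the optimised collective-sample baseline, as with hour 3 in Table 2:

# Hedged sketch: flag hourly intervals where the SIMO result did not
# improve on the baseline score for a given metric (accuracy or
# f-measure). The score dictionaries map hour -> metric value and are
# illustrative placeholders, not the study's results.
def flag_anomalies(simo_scores, baseline_scores, tolerance=0.0):
    return sorted(hour for hour, score in simo_scores.items()
                  if score <= baseline_scores[hour] + tolerance)

# Example: hour 3 is returned because SIMO fails to beat the baseline.
simo = {1: 0.91, 2: 0.93, 3: 0.84, 4: 0.95}
base = {1: 0.88, 2: 0.90, 3: 0.86, 4: 0.89}
print(flag_anomalies(simo, base))  # -> [3]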
6 CONCLUSIONS
In this study a novel SIMO method was presented, using a hybrid approach that incorporates SMAC optimisation and SMOTE instance generation, with the aim of assessing the optimal synthetic instance generation volume and parameter settings for optimised classification. In summary, the current findings identify optimal parameter settings and classifiers for a range of duration intervals, providing a knowledge base for future optimisation experiments in this field.
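To make the hybrid concrete, the following is a minimal sketch of the SIMO loop rather than the implementation evaluated in this study: it assumes a binary imbalanced dataset, uses imbalanced-learn's SMOTE for synthetic instance generation, and substitutes a simple random search over a scikit-learn RandomForestClassifier for the SMAC optimiser and the classifiers used here; the increment ratios and parameter grid are illustrative placeholders.

# Minimal SIMO sketch: for each synthetic instance increment,
# oversample the sparse training set with SMOTE, then search classifier
# configurations and record accuracy and f-measure on a held-out set.
# Random search stands in for SMAC; all settings are illustrative.
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import ParameterSampler, train_test_split

def simo_search(X, y, increment_ratios, n_configs=10, seed=0):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    param_space = {"n_estimators": [50, 100, 200],
                   "max_depth": [3, 5, 10, None]}
    results = []
    for ratio in increment_ratios:  # growing minority/majority ratios
        X_syn, y_syn = SMOTE(sampling_strategy=ratio,
                             random_state=seed).fit_resample(X_tr, y_tr)
        best = None
        for params in ParameterSampler(param_space, n_configs,
                                       random_state=seed):
            clf = RandomForestClassifier(**params, random_state=seed)
            clf.fit(X_syn, y_syn)
            pred = clf.predict(X_te)
            score = (accuracy_score(y_te, pred), f1_score(y_te, pred))
            if best is None or score > best[0]:
                best = (score, params)
        results.append((ratio, best[0], best[1]))
    return results  # per increment: (ratio, (accuracy, f1), config)

Under this framing, the comparison against standalone SMAC corresponds to running the same configuration search with the SMOTE step omitted.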
Individual classifiers can now be correctly distinguished according to whether they perform best over reduced or extended optimisation periods. This information is indicative of each algorithm's potential suitability for more machine-intensive problems, such as deep structured learning studies, and can eliminate certain algorithms from future SIMO prediction training. In the examples in Figures 5 and 6, there is evidence of increasing accuracy and f-measure performance in the majority of cases, which provides positive insight for building future predictive models.
Another example is shown in Figure 10(a), where 13 hours of optimisation provides an average increase of 27.2% over the standard SMAC implementation, with 9 out of 24 configurations achieving more than a 15% average accuracy increase. In terms of f-measure, 25% of the average configuration increases yielded an improvement of more than 2.5, with a total average increase of 0.18 across all results and a high average increase of 3.4 when optimising for 4 hours, with increases of 0.43 in some cases.
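Summary statistics of this kind (the average increase over the standard SMAC implementation and the number of configurations exceeding a threshold) can be derived from paired per-configuration results, as in the sketch below; the accuracy values and the 15-point threshold are hypothetical placeholders rather than the study's data.

# Sketch: per-configuration improvement statistics from paired
# accuracy results (in percent). The arrays are placeholders.
import numpy as np

def improvement_stats(simo_acc, smac_acc, threshold=15.0):
    delta = np.asarray(simo_acc) - np.asarray(smac_acc)
    return delta.mean(), int((delta > threshold).sum())

simo_acc = [92.1, 88.4, 95.0, 77.3]  # SIMO accuracy per configuration
smac_acc = [71.5, 80.2, 76.9, 75.0]  # standalone SMAC accuracy
mean_gain, n_large = improvement_stats(simo_acc, smac_acc)
print(f"average increase: {mean_gain:.1f} points; "
      f"{n_large} configurations above 15 points")
# -> average increase: 12.3 points; 2 configurations above 15 points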
In Table 2, the results show a high frequency of certain algorithms, such as OneR, providing optimum performance in 5 of the 24 hourly intervals, with similar regularity from Random Tree and Logistic Regression delivering the most accurate performance levels, with accuracy in the high 90% range. These classifiers indicate optimal suitability for this research problem and provide a basis for future baseline experiments with the novel SIMO model. The analysis factors requiring further assessment, based on all results, will contrast the benefits of exploration and exploitation; that is, determining which performance level provides the greatest improvement while remaining efficient and maintaining data authenticity. The next phase of validating this method will involve empirical evaluation of alternative sampling methods and representative datasets for comparative performance analysis.
ACKNOWLEDGEMENTS
Training data is supplied by research partners at The
European Space Agency in conjunction with The
Academy of Opto-electronics, CAS, China.