choosing the right embedding for the first one. It is
well known that no ML model is perfect, because each
one has its own peculiar characteristics. In our case,
the EM algorithm is fast, and the best GMM model
obtained on the current ACEA data set has a low
computational complexity in terms of the number of
components, working with a shared covariance matrix.
As for the OCC System, it
reaches very good results in terms of accuracy, also
yielding classification models characterized by a low
number of clusters, even though the evolutionary
procedure slows down the training process. This is the
price of obtaining a robust model in which the weights
of the custom-based dissimilarity measures can also be
interpreted as the importance of each feature in the
classification task. This interesting property, together
with the analysis of cluster contents, enables knowledge
discovery applications. Moreover, some applications
require calibrated probabilities as output scores, and
both of the compared techniques are weakly calibrated.
Future work will focus on studying and applying
several classical and newly proposed calibration
techniques to the OCC System output scores, as
required by the objectives of the main project.
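As an illustration of what "weak calibration" means for output scores, the following minimal sketch computes the Brier score and a coarse reliability table for a set of predicted probabilities. It is not the evaluation code used in this work: the function names, the number of bins, and the sample scores and labels are all hypothetical and chosen only for illustration.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def reliability_bins(probs, labels, n_bins=5):
    """Return (mean predicted probability, observed positive rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    table = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            rate = sum(y for _, y in b) / len(b)
            table.append((mean_p, rate, len(b)))
    return table


# Hypothetical classifier scores and labels (1 = fault), for illustration only.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 0, 0, 1]

print(brier_score(scores, labels))
print(reliability_bins(scores, labels, n_bins=2))
```

For a well-calibrated model, the mean predicted probability and the observed positive rate in each bin are close to each other; large gaps between the two indicate that the scores should not be read directly as probabilities.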
ACKNOWLEDGEMENTS
The authors wish to thank ACEA Distribuzione
S.p.A. for providing the data and for their continuous
support during the design and test phases. Special
thanks to Ing. Stefano Liotta, Chief Network Operation
Division, to Ing. Silvio Alessandroni, Chief Electric
Power Distribution, and to Ing. Maurizio Paschero,
Chief Remote Control Division.
CI4EMS 2020 - Special Session on Computational Intelligence for Energy Management and Storage