IMPROVING ELECTRIC FRAUD DETECTION USING CLASS IMBALANCE STRATEGIES

Matías Di Martino, Federico Decia, Juan Molinelli, Alicia Fernández

Abstract

Improving nontechnical loss detection is a major challenge for electric companies. The large number of clients and the diversity of fraud types make this a very complex task. In this paper we present a fraud detection strategy based on class imbalance research. An automatic detection tool combining classification strategies is proposed. Individual classifiers such as One-Class SVM, Cost-Sensitive SVM (CS-SVM), Optimum Path Forest (OPF) and C4.5 Tree, as well as combination functions, are designed taking special care of the class-imbalanced nature of the data. Analysis of consumers' historical kWh load profile data from the Uruguayan Electric Company (UTE) shows that using combination and balancing techniques improves automatic detection performance.



Paper Citation


in Harvard Style

Di Martino M., Decia F., Molinelli J. and Fernández A. (2012). IMPROVING ELECTRIC FRAUD DETECTION USING CLASS IMBALANCE STRATEGIES. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM, ISBN 978-989-8425-99-7, pages 135-141. DOI: 10.5220/0003768401350141


in Bibtex Style

@conference{icpram12,
author={Matías Di Martino and Federico Decia and Juan Molinelli and Alicia Fernández},
title={IMPROVING ELECTRIC FRAUD DETECTION USING CLASS IMBALANCE STRATEGIES},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2012},
pages={135-141},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003768401350141},
isbn={978-989-8425-99-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - IMPROVING ELECTRIC FRAUD DETECTION USING CLASS IMBALANCE STRATEGIES
SN - 978-989-8425-99-7
AU - Di Martino M.
AU - Decia F.
AU - Molinelli J.
AU - Fernández A.
PY - 2012
SP - 135
EP - 141
DO - 10.5220/0003768401350141