Revealing Encrypted WebRTC Traffic via Machine Learning Tools

Mario Di Mauro, Maurizio Longo

Abstract

The detection of encrypted real-time traffic, both streaming and conversational, is an increasingly important issue for agencies in charge of lawful interception. Alongside well-established real-time communication technologies (e.g. Skype, FaceTime, Lync), a new one has recently been spreading: Web Real-Time Communication (WebRTC), which, supported by a robust encryption method such as DTLS, offers encrypted voice and video without requiring a dedicated application, using only a common browser such as Chrome, Firefox or Opera. Encrypted WebRTC traffic cannot be recognized through semantic recognition methods, since it does not exhibit a discernible sequence of information pieces; statistical recognition methods are therefore called for. In this paper we propose and evaluate a decision-theory-based system that recognizes encrypted WebRTC traffic by means of an open-source machine learning environment, Weka. In addition, a reasoned comparison among some of the most credited algorithms in the field of decision systems (J48, Simple Cart, Naïve Bayes, Random Forests) has been carried out, indicating that Random Forests prevails over the others.


Paper Citation


in Harvard Style

Di Mauro M. and Longo M. (2015). Revealing Encrypted WebRTC Traffic via Machine Learning Tools. In Proceedings of the 12th International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2015) ISBN 978-989-758-117-5, pages 259-266. DOI: 10.5220/0005542202590266


in Bibtex Style

@conference{secrypt15,
author={Mario Di Mauro and Maurizio Longo},
title={Revealing Encrypted WebRTC Traffic via Machine Learning Tools},
booktitle={Proceedings of the 12th International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2015)},
year={2015},
pages={259-266},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005542202590266},
isbn={978-989-758-117-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2015)
TI - Revealing Encrypted WebRTC Traffic via Machine Learning Tools
SN - 978-989-758-117-5
AU - Di Mauro M.
AU - Longo M.
PY - 2015
SP - 259
EP - 266
DO - 10.5220/0005542202590266