User Feedback Analysis for Mobile Malware Detection

Tal Hadad, Bronislav Sidik, Nir Ofek, Rami Puzis, Lior Rokach


With the increasing number of smartphone users, mobile malware has become a serious threat. As with best practice on personal computers, users are encouraged to install anti-virus and intrusion detection software on their mobile devices. Nevertheless, their devices are far from fully protected. Major mobile application distributors, designated stores and marketplaces, inspect uploaded applications with state-of-the-art malware detection tools and remove applications that turn out to be malicious. Unfortunately, many malicious applications have a large window of opportunity before they are removed from the marketplace. Meanwhile, users install the applications, use them, and leave comments in the respective marketplaces. Occasionally, such comments trigger the interest of malware laboratories in inspecting a particular application and thus speed up its removal from the marketplaces. In this paper, we present a new approach for mining user comments in mobile application marketplaces with the purpose of detecting malicious apps. Two computationally efficient features are suggested and evaluated using data collected from the "Amazon Appstore". Using these two features, we show that feedback generated by the crowd is effective for detecting malicious applications without the need to download them.



Paper Citation

in Harvard Style

Hadad T., Sidik B., Ofek N., Puzis R. and Rokach L. (2017). User Feedback Analysis for Mobile Malware Detection. In Proceedings of the 3rd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-209-7, pages 83-94. DOI: 10.5220/0006131200830094

in Bibtex Style

author={Tal Hadad and Bronislav Sidik and Nir Ofek and Rami Puzis and Lior Rokach},
title={User Feedback Analysis for Mobile Malware Detection},
booktitle={Proceedings of the 3rd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP,},

in EndNote Style

JO - Proceedings of the 3rd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP,
TI - User Feedback Analysis for Mobile Malware Detection
SN - 978-989-758-209-7
AU - Hadad T.
AU - Sidik B.
AU - Ofek N.
AU - Puzis R.
AU - Rokach L.
PY - 2017
SP - 83
EP - 94
DO - 10.5220/0006131200830094