Cloud based Privacy Preserving Data Mining with Decision Tree

Echo P. Zhang, Yi-Jun He, Lucas C. K. Hui


Privacy Preserving Data Mining (PPDM) aims to perform data mining across multiple parties while ensuring that no party has to reveal its private data to any other. Cloud services are becoming increasingly popular, yet how to handle their privacy issues is still an open question. This paper is one of the first studies of cloud server based PPDM. We propose a novel protocol in which the cloud server performs data mining on encrypted databases while guaranteeing the privacy of each client. The scheme protects clients from malicious users and, with the aid of a hardware box, also from an untrusted cloud server. Another novel feature of the solution is that it works even when the databases of different parties overlap.



Paper Citation

in Harvard Style

Zhang, E. P., He, Y.-J. and Hui, L. C. K. (2012). Cloud based Privacy Preserving Data Mining with Decision Tree. In Proceedings of the International Conference on Data Technologies and Applications - Volume 1: DATA, ISBN 978-989-8565-18-1, pages 5-14. DOI: 10.5220/0003996200050014

in EndNote Style

JO - Proceedings of the International Conference on Data Technologies and Applications - Volume 1: DATA
TI - Cloud based Privacy Preserving Data Mining with Decision Tree
SN - 978-989-8565-18-1
AU - Zhang, Echo P.
AU - He, Yi-Jun
AU - Hui, Lucas C. K.
PY - 2012
SP - 5
EP - 14
DO - 10.5220/0003996200050014