itemsets, “Support” indicates that the victim item is
selected according to its support, “Greedy” indicates that
the victim item is selected by trial and error, and “All”
indicates that the whole sensitive itemset is deleted from
the sensitive transaction rather than a single victim item.
The last column shows the year of the related research.
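The “Support” and “All” strategies described above can be illustrated with a minimal Python sketch. It is not the implementation of any surveyed algorithm: the function names (`support`, `pick_victim_by_support`, `sanitize`), the choice of the highest-support item as victim, and the use of absolute counts as thresholds are all assumptions made for illustration.

```python
from collections import Counter


def support(db, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def pick_victim_by_support(db, itemset):
    """Assumed 'Support' rule: pick the item with the highest support.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(item for t in db for item in t)
    return max(sorted(itemset), key=lambda item: counts[item])


def sanitize(db, sensitive, delete_all=False):
    """Hide each sensitive itemset below its own support threshold.

    `sensitive` maps a frozenset of items to its threshold (absolute count).
    With delete_all=True the whole itemset is removed from a transaction
    ("All"); otherwise a single victim item is removed ("Support").
    """
    result = [set(t) for t in db]  # work on a copy, keep the input intact
    for itemset, threshold in sensitive.items():
        for t in result:
            if support(result, itemset) < threshold:
                break  # this itemset is already hidden
            if itemset <= t:
                if delete_all:
                    t.difference_update(itemset)
                else:
                    t.discard(pick_victim_by_support(result, itemset))
    return result


db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "d"}]
sensitive = {frozenset({"a", "b"}): 2}
sanitized = sanitize(db, sensitive)
```

Because each sensitive itemset carries its own threshold in the `sensitive` mapping, the same loop covers the multiple support threshold setting. A “Greedy” variant would instead try candidate victims and keep the one that causes the fewest side effects.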
When we analyse the existing heuristic
sanitization algorithms, we see that i) they use
different heuristics that aim to reduce execution
time, distance and information loss while keeping
hiding failure to a minimum, and ii) few heuristic
based approaches focus on sanitization under
multiple support thresholds.
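The side effect measures named above can be made concrete with a short Python sketch. The definitions used here are common readings of these terms and are assumptions, not definitions taken from this paper: hiding failure counts sensitive itemsets that remain frequent, distance counts deleted item occurrences, and information loss counts non-sensitive itemsets that stop being frequent. All function names are hypothetical.

```python
def support(db, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def hiding_failure(sanitized, sensitive):
    """Sensitive itemsets that still reach their own threshold."""
    return sum(
        1 for itemset, threshold in sensitive.items()
        if support(sanitized, itemset) >= threshold
    )


def distance(original, sanitized):
    """Item occurrences removed, comparing transactions pairwise."""
    return sum(len(o - s) for o, s in zip(original, sanitized))


def information_loss(original, sanitized, non_sensitive, threshold):
    """Non-sensitive itemsets frequent before but not after sanitization."""
    return sum(
        1 for itemset in non_sensitive
        if support(original, itemset) >= threshold
        and support(sanitized, itemset) < threshold
    )


original = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "d"}]
sanitized = [{"b", "c"}, {"a"}, {"a", "c"}, {"b", "c"}, {"a", "b", "d"}]
sensitive = {frozenset({"a", "b"}): 2}
non_sensitive = [frozenset({"a", "c"}), frozenset({"b", "c"})]

failures = hiding_failure(sanitized, sensitive)
removed = distance(original, sanitized)
lost = information_loss(original, sanitized, non_sensitive, threshold=2)
```

In this toy example the sensitive itemset {a, b} is hidden at the cost of two deleted items, and the non-sensitive itemset {a, c} is lost as a side effect.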
6 CONCLUSIONS
When itemset mining is applied to data shared among
organizations, each party needs to hide its
sensitive knowledge before global knowledge is
extracted for mutual benefit. In this study, we focus
on privacy preserving itemset hiding under multiple
support thresholds. Our algorithm, Pseudo Graph Based
Sanitization (PGBS), uses a pseudo graph data
structure to store the given transactional database,
which avoids multiple scans of the database and allows
an effective sanitization process. We evaluate the
execution time and side effect performance of PGBS
against two recent algorithms on four real databases,
varying the number of sensitive itemsets and the
sensitive thresholds.
Experimental results show that PGBS is competitive
with the other algorithms in terms of execution time
and distance, especially on dense datasets. As
future work, we plan to propose a dynamic version of
our algorithm that is able to sanitize updated
databases.
ACKNOWLEDGEMENTS
This work is partially supported by the Scientific and
Technological Research Council of Turkey
(TUBITAK) under ARDEB 3501 Project No:
114E779.