Privacy Risk Assessment of Textual Publications in Social Networks

David Sanchez, Alexandre Viejo


Recent studies have warned that users of social networks often publish sensitive data that dishonest parties can exploit. Some mechanisms have been proposed to preserve the privacy of social network users (e.g., controlling who can access a given publication); however, there is still a lack of proposals that make users aware of the sensitivity of the content they publish. This is especially true for unstructured textual publications (e.g., wall posts and tweets), which are considered particularly dangerous from a privacy point of view because of their dynamism and high informativeness. To tackle this problem, in this paper we present an automatic method to assess the sensitivity of a user's textual publications according to her privacy requirements towards the other users in the social network. In this manner, users can get a clear picture of the privacy risks inherent to their publications and can take appropriate countermeasures to mitigate them. The feasibility of the method is studied in a highly sensitive social network: PatientsLikeMe.


Paper Citation

in Harvard Style

Sanchez D. and Viejo A. (2015). Privacy Risk Assessment of Textual Publications in Social Networks. In Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-073-4, pages 236-241. DOI: 10.5220/0005281202360241

in Bibtex Style

author={David Sanchez and Alexandre Viejo},
title={Privacy Risk Assessment of Textual Publications in Social Networks},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},

in EndNote Style

JO - Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Privacy Risk Assessment of Textual Publications in Social Networks
SN - 978-989-758-073-4
AU - Sanchez D.
AU - Viejo A.
PY - 2015
SP - 236
EP - 241
DO - 10.5220/0005281202360241