Mining for Adverse Drug Events on Twitter

Felipe Duval, Ernesto Caffarena, Oswaldo Cruz, Fabrício Silva


In the post-marketing phase, when drugs are used by large populations over long periods, unexpected adverse events may occur that alter a drug's risk-benefit balance and sometimes require regulatory action. These events place a significant additional burden on health care, since they cause avoidable harm to patients that is often fatal. The early discovery of adverse events in the post-marketing phase is therefore a primary goal of the health system, and of pharmacovigilance systems in particular. The main purpose of this paper is to show that Twitter can be used as a source for finding both new and already known adverse drug events. The proposal has clear social relevance, as it can support pharmacovigilance systems.



Paper Citation

in Harvard Style

Duval F., Caffarena E., Cruz O. and Silva F. (2014). Mining for Adverse Drug Events on Twitter. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014), ISBN 978-989-758-048-2, pages 354-359. DOI: 10.5220/0005135203540359

