Authors:
Diana MacLean
1
and
Margo Seltzer
2
Affiliations:
1
Stanford University, United States
;
2
Harvard School of Engineering and Applied Sciences, United States
Keyword(s):
Data mining, Knowledge discovery, Vioxx, Myocardial infarction, UMLS, MetaMap.
Related
Ontology
Subjects/Areas/Topics:
Artificial Intelligence
;
Biomedical Engineering
;
Business Analytics
;
Cloud Computing
;
Data Engineering
;
Data Mining
;
Databases and Information Systems Integration
;
Datamining
;
e-Health
;
Enterprise Information Systems
;
Health Information Systems
;
Information Systems Analysis and Specification
;
Knowledge Management
;
Ontologies and the Semantic Web
;
Platforms and Applications
;
Sensor Networks
;
Signal Processing
;
Society, e-Business and e-Government
;
Soft Computing
;
Web Information Systems and Technologies
Abstract:
As the prevalence of blogs, discussion forums, and online news services continues to grow, so too does the portion of this Web content that relates to health and medicine. We propose that everyday, medically-oriented Web content is a valuable and viable data source for medical hypothesis generation and testing, despite its being noisy. In this paper, we present a proof-of-concept system supporting this notion. We construct a corpus comprising news articles relating to the drugs Vioxx, Naproxen and Ibuprofen, that were published between 1998-2002. Using this corpus, we show that there was a significant link between Vioxx and the concept “Myocardial Infarction” well before the drug was withdrawn from the market in 2004. Indeed, within the Vioxx-related content, the concept ranks amongst the top 3.3% in terms of importance. When compared with the Naproxen and Ibuprofen control literatures, the term occurs significantly more frequently in the Vioxx-related content.