Authors:
Nikola Milosevic (1); Cassie Gregson (2); Robert Hernandez (2) and Goran Nenadic (1)
Affiliations:
(1) University of Manchester, United Kingdom; (2) AstraZeneca plc, United Kingdom
Keyword(s):
Text Mining, Table Mining, Information Extraction, Natural Language Processing, Clinical Trials.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Health Information Systems; Information Systems Analysis and Specification; Knowledge Management; Ontologies and the Semantic Web; Sensor Networks; Signal Processing; Society, e-Business and e-Government; Soft Computing; Software Systems in Medicine; Web Information Systems and Technologies
Abstract:
Current biomedical text mining efforts focus mostly on extracting information from the body text of research articles. However, tables contain important information, such as key characteristics of clinical trials. Here, we examine the feasibility of information extraction from tables, focusing on data about clinical trial participants. We propose a rule-based method that decomposes tables into cell-level structures and then extracts information from these structures. Our method achieved an F-measure of 83.3% for extracting the number of patients, 83.7% for patients' body mass index and 57.75% for patients' weight. These results are promising and show that information extraction from tables in biomedical literature is feasible.
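To make the two-stage idea in the abstract concrete (decompose a table into cell-level structures, then apply rules to those structures), the following is a minimal illustrative sketch. It is not the authors' implementation: the `Cell` class, the `decompose` and `extract` functions, the `RULES` patterns and the example table are all hypothetical and assume a simple grid whose first row holds column headers and whose first column holds row labels.

```python
import re
from dataclasses import dataclass


@dataclass
class Cell:
    """One data cell together with the headers that give it meaning."""
    row_header: str  # label from the first column of the cell's row
    col_header: str  # header from the first row above the cell
    value: str       # raw cell text


def decompose(table):
    """Flatten a simple grid into Cell objects (assumed layout, see lead-in)."""
    header = table[0]
    cells = []
    for row in table[1:]:
        for j in range(1, len(row)):
            cells.append(Cell(row[0], header[j], row[j]))
    return cells


# Hypothetical rules: (variable, pattern on the row header, pattern on the value).
RULES = [
    ("number_of_patients", re.compile(r"\b(patients|subjects)\b", re.I),
     re.compile(r"^\s*(\d+)\s*$")),
    ("bmi", re.compile(r"\b(bmi|body mass index)\b", re.I),
     re.compile(r"^\s*(\d+(?:\.\d+)?)")),
    ("weight", re.compile(r"\bweight\b", re.I),
     re.compile(r"^\s*(\d+(?:\.\d+)?)")),
]


def extract(cells):
    """Apply the first rule whose header pattern matches each cell."""
    results = []
    for cell in cells:
        for name, header_pat, value_pat in RULES:
            if header_pat.search(cell.row_header):
                match = value_pat.match(cell.value)
                if match:
                    results.append((name, cell.col_header, float(match.group(1))))
                break
    return results


# Invented example table, for illustration only (not data from the paper).
table = [
    ["Characteristic", "Placebo", "Treatment"],
    ["Patients, n", "52", "49"],
    ["BMI (kg/m2)", "27.4 (3.1)", "28.0 (2.9)"],
    ["Weight (kg)", "81.2 ± 10.5", "83.0 ± 11.1"],
]

for variable, arm, value in extract(decompose(table)):
    print(variable, arm, value)
```

Keeping the headers attached to each cell is what lets the rules stay simple: a rule only needs to look at one cell-level structure at a time, without re-reading the surrounding table. Real tables with spanning headers, nested stubs or merged cells would need a more elaborate decomposition step than this sketch assumes.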