Identifying Most Relevant Concepts to Describe Clinical Trial Eligibility Criteria
Krystyna Milian, Anca Bucur, Frank van Harmelen, Annette ten Teije
2013
Abstract
Because eligibility criteria of clinical trials are written as free text, interpreting them automatically and evaluating patient eligibility are challenging. Our approach to processing criteria is based on identifying contextual patterns and semantic concepts that together define their machine-interpretable meaning. The goal of this research is to find the most relevant concepts occurring in eligibility criteria, that is, those that need to be mapped to the patient record to enable automatic evaluation of patient eligibility. Based on an analysis of annotations of breast cancer trials obtained with different concept recognizers and ontologies from the UMLS Thesaurus, we chose MetaMap and SNOMED CT to create the mapping set. To prioritize the identified concepts, we used the tf-idf measure and a corpus of over 38,000 clinical trials to detect concepts specific to breast cancer and to cancer in general. The results can guide the order in which criteria concepts are mapped to patient data. The substantial overlap observed between terms occurring in the criteria of breast cancer trials and those of trials for other diseases will reduce the cost of extending the trial matching system to other diseases.
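The tf-idf prioritization mentioned above can be illustrated with a minimal sketch. The abstract does not state which tf-idf variant was used, so the weighting below (relative term frequency in the target trials, multiplied by the logarithm of the inverse trial frequency in the whole corpus) is an assumption, as are the function name `rank_concepts`, the representation of each trial as a list of concept labels, and the toy data.

```python
import math
from collections import Counter

def rank_concepts(target_trials, corpus_trials):
    """Rank concepts from target_trials by tf-idf against corpus_trials.

    Assumed (hypothetical) representation: each trial is a list of
    concept labels extracted from its eligibility criteria.
    tf  = occurrences of a concept in the target trials / all occurrences
    idf = log(N / df), df = number of corpus trials containing the concept
    """
    tf_counts = Counter(c for trial in target_trials for c in trial)
    total = sum(tf_counts.values())
    n_trials = len(corpus_trials)
    # Count each concept at most once per trial for the document frequency.
    df = Counter(c for trial in corpus_trials for c in set(trial))

    scores = {}
    for concept, count in tf_counts.items():
        # Assumption: a concept missing from the corpus is treated as df = 1.
        idf = math.log(n_trials / max(df[concept], 1))
        scores[concept] = (count / total) * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data, invented purely for illustration.
breast_trials = [
    ["tamoxifen", "age", "pregnancy"],
    ["tamoxifen", "mastectomy", "age"],
]
all_trials = breast_trials + [
    ["age", "pregnancy", "insulin"],
    ["age", "asthma"],
    ["age", "pregnancy"],
]

ranking = rank_concepts(breast_trials, all_trials)
for concept, score in ranking:
    print(f"{concept}: {score:.3f}")
```

In this toy corpus a concept that appears in every trial, such as "age", receives a score of zero, while concepts concentrated in the breast cancer trials rise to the top. This is the behaviour that makes tf-idf useful for separating disease-specific concepts from generic ones.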
Paper Citation
in Harvard Style
Milian K., Bucur A., van Harmelen F. and ten Teije A. (2013). Identifying Most Relevant Concepts to Describe Clinical Trial Eligibility Criteria. In Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2013) ISBN 978-989-8565-37-2, pages 161-166. DOI: 10.5220/0004192501610166
in Bibtex Style
@conference{healthinf13,
author={Krystyna Milian and Anca Bucur and Frank van Harmelen and Annette ten Teije},
title={Identifying Most Relevant Concepts to Describe Clinical Trial Eligibility Criteria},
booktitle={Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2013)},
year={2013},
pages={161-166},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004192501610166},
isbn={978-989-8565-37-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2013)
TI - Identifying Most Relevant Concepts to Describe Clinical Trial Eligibility Criteria
SN - 978-989-8565-37-2
AU - Milian K.
AU - Bucur A.
AU - van Harmelen F.
AU - ten Teije A.
PY - 2013
SP - 161
EP - 166
DO - 10.5220/0004192501610166