Text Mining Technologies for Database Curation

Fabio Rinaldi

2014

Abstract

Text mining technologies, coupled with advanced user interfaces, have great potential in the life sciences, for example in supporting the process of database curation. We present a system that has achieved competitive results in several community-organized evaluations of text mining technologies, and we discuss how such technologies can be integrated into a curation workflow.


Paper Citation


in Harvard Style

Rinaldi F. (2014). Text Mining Technologies for Database Curation. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: SSTM, (IC3K 2014) ISBN 978-989-758-048-2, pages 544-548. DOI: 10.5220/0005174905440548


in Bibtex Style

@conference{sstm14,
author={Fabio Rinaldi},
title={Text Mining Technologies for Database Curation},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: SSTM, (IC3K 2014)},
year={2014},
pages={544-548},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005174905440548},
isbn={978-989-758-048-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: SSTM, (IC3K 2014)
TI - Text Mining Technologies for Database Curation
SN - 978-989-758-048-2
AU - Rinaldi F.
PY - 2014
SP - 544
EP - 548
DO - 10.5220/0005174905440548