GENERATING LITERATURE-BASED KNOWLEDGE DISCOVERIES IN LIFE SCIENCES USING RELATIONSHIP ASSOCIATIONS

Steven B. Kraines, Weisen Guo, Daisuke Hoshiyama, Haruo Mizutani, Toshihisa Takagi

2010

Abstract

The life sciences have been a pioneering discipline in knowledge discovery ever since Swanson's literature-based discoveries three decades ago. Existing literature-based knowledge discovery techniques generally try to discover hitherto unknown associations between domain concepts based on associations that can be established from the literature. However, scientific facts are more often expressed as specific relationships between concepts and/or entities that have been established through scientific research. Two relationships, each predicating the specific way in which one concept relates to another, can be associated if one concept from each relationship can be determined to be semantically equivalent; we call this a “relationship association”. Then, by making the same assumption of transitivity of association used by Swanson and others, we can generate a hypothetical relationship association by combining two relationship associations that have been extracted from a knowledge base. Here we describe an algorithm for generating potential knowledge discoveries in the form of new relationship associations that are implied but not actually stated, and we test the algorithm against a corpus of almost 5000 relationship associations that we extracted in previous work from 392 semantic graphs representing research articles from MEDLINE.
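
The transitivity step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Relationship` tuple, the treatment of an association as an ordered pair, and the `hypothesize` function are all assumptions made for illustration, and exact equality stands in for the semantic equivalence test that the paper would establish by other means. The example relationships are placeholders, not facts extracted from any article.

```python
from typing import NamedTuple, Set, Tuple


class Relationship(NamedTuple):
    """One stated relationship: subject --predicate--> obj (hypothetical model)."""
    subject: str
    predicate: str
    obj: str


# Assumption: a relationship association is an ordered pair of relationships.
Association = Tuple[Relationship, Relationship]


def hypothesize(known: Set[Association]) -> Set[Association]:
    """Chain (a, b) and (b, c) into (a, c) when (a, c) is not already known."""
    hypotheses: Set[Association] = set()
    for first, middle in known:
        for other_middle, last in known:
            # Exact equality stands in for semantic equivalence here.
            if (middle == other_middle and first != last
                    and (first, last) not in known):
                hypotheses.add((first, last))
    return hypotheses


# Placeholder relationships, not extracted from any real article.
r_a = Relationship("entity A", "activates", "entity B")
r_b = Relationship("entity B", "binds", "entity C")
r_c = Relationship("entity C", "inhibits", "entity D")

known = {(r_a, r_b), (r_b, r_c)}
new_hypotheses = hypothesize(known)
```

In this toy case the only implied but unstated association pairs `r_a` with `r_c`. Associations already present in the knowledge base are filtered out, so only candidates that are new relative to the corpus are returned.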

References

  1. Allen, J.F., 2001. In silico veritas - Data-mining and automated discovery: the truth is in there. EMBO Reports, 2, 542-544.
  2. Baader, F., Calvanese, D., McGuinness, D. L., Nardi, D., Patel-Schneider, P. F., 2003. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, New York.
  3. Berners-Lee T., Hendler, J., 2001. Publishing on the Semantic Web. Nature, 410, 1023-1024.
  4. Bontcheva, K., Wilks, Y., 2004. Automatic Report Generation from Ontologies: The MIAKT Approach. In Proceedings of the 9th International Conference on Applications of Natural Language to Information Systems, pp. 324-335.
  5. Cafarella, M. J., Re, C., Suciu, D., Etzioni, O., 2007. Structured Querying of Web Text Data: A Technical Challenge. In Proceedings of CIDR2007.
  6. Ceol, A., Chatr-Aryamontri, A., Licata, L., Cesareni, G., 2008. Linking Entries in Protein Interaction Database to Structured Text: the FEBS Letters Experiment. FEBS letters, 582(8), 1171-1177.
  7. Erhardt, R. A-A., Schneider, R., Blaschke, C., 2006. Status of text-mining techniques applied to biomedical text. Drug Discovery Today, 11(7-8), 315-325.
  8. Gerstein, M., Seringhaus, M., Fields, S., 2007. Structured digital abstract makes text mining easy. Nature, 447, 142.
  9. Guo, W., Kraines, S. B., 2008. Explicit Scientific Knowledge Comparison Based on Semantic Description Matching. American Society for Information Science and Technology 2008 Annual Meeting, Columbus, Ohio.
  10. Guo, W., Kraines, S. B., 2009. Discovering Relationship Associations in Life Sciences Using Ontology and Inference, Proceedings of 1st International Conference on Knowledge Discovery and Information Retrieval 2009, Madeira, Portugal, pp. 10-17, 6-8 October, 2009.
  11. Guo, W., Kraines, S. B., 2010a. Extracting Relationship Associations from Semantic Graphs in Life Sciences. Communications in Computer and Information Science (CCIS), in press.
  12. Guo, W., Kraines, S. B., 2010b. Mining Relationship Associations from Knowledge about Failures using Ontology and Inference. 10th Industrial Conference on Data Mining ICDM 2010, Berlin, Germany, July 12-14, Advances in Data Mining, Lecture Notes in Artificial Intelligence (LNAI), accepted.
  13. Hartley, J., Betts, L., 2007. The effects of spacing and titles on judgments of the effectiveness of structured abstracts. JASIST, 58(14), 2335-2340.
  14. Hristovski, D., Friedman, C., Rindflesch, T. C., Peterlin, B., 2006. Exploiting Semantic Relations for Literature-Based Discovery. In AMIA Annu Symp Proc. 2006, pp. 349-353.
  15. Hunter, L., Cohen, K. B., 2006. Biomedical language processing: what's beyond PubMed? Mol Cell., 21, 589-94.
  16. Kraines, S., 2010. An Ontology-based System for Sharing Expert Knowledge in Life Sciences. Journal of Information Research, in review.
  17. Kraines, S., Guo, W., Kemper, B., Nakamura, Y., 2006. EKOSS: A Knowledge-User Centered Approach to Knowledge Sharing, Discovery, and Integration on the Semantic Web. The 5th International Semantic Web Conference, LNCS 4273, 833-846.
  18. Langley, P., 2000. The computational support of scientific discovery. International Journal of Human-Computer Studies, 53, 393-410.
  19. Natarajan, J., Berrar, D., Dubitzky, W., Hack, C., Zhang, Y., DeSesa, C., Van Brocklyn, J. R, Bremer, E. G, 2006. Text mining of full-text journal articles combined with gene expression analysis reveals a relationship between sphingosine-1-phosphate and invasiveness of a glioblastoma cell line. BMC Bioinformatics, 7, 373.
  20. Natarajan, J., Berrar, D., Hack, C. J., Dubitzky, W., 2005. Knowledge discovery in biology and biotechnology texts: A review of techniques, evaluation strategies, and applications. Critical Rev in Biotech, 25, 31-52.
  21. Obama, K., Kato, T., Hasegawa, S., Satoh, S., Nakamura, Y., Furukawa, Y., 2006. Overexpression of peptidylprolyl isomerase-like 1 is associated with the growth of colon cancer cells. Clinical cancer research : an official journal of the American Association for Cancer Research, 12: 70-6.
  22. O'Donnell, M., Mellish, C., Oberlander, J., Knott, A., 2001. ILEX: an architecture for a dynamic hypertext generation system. Nat. Lang. Eng., 7(3), 225-250.
  23. Pico, A. R., Kelder, T., van Iersel, M. P., Hanspers, K., Conklin, B. R., Evelo, C., 2008. WikiPathways: Pathway Editing for the People. PLoS Biol, 6(6), e184+.
  24. Racunas, S. A., Shah, N. H., Albert, I., Fedoroff, N. V., 2004. HyBrow: a prototype system for computer-aided hypothesis evaluation. Bioinformatics, 20 (Suppl 1), i257-i264.
  25. Rinaldi, F., Schneider, G., Kaljurand, K., Hess, M., Romacker, M., 2006. An environment for relation mining over richly annotated corpora: the case of GENIA. BMC Bioinformatics, 7 (Suppl 3), S3.
  26. Seringhaus, M., Gerstein, M., 2008. Manually structured digital abstracts: a scaffold for automatic text mining. FEBS Lett, 582, 1170.
  27. Smalheiser, N. R., 2002. Informatics and hypothesis-driven research. EMBO Reports, 3, 702-702.
  28. Srinivasan, P., 2004. Text Mining: Generating Hypotheses From MEDLINE. JASIST, 55(5), 396-413.
  29. Swanson, D. R., 1986. Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine, 30, 7-18.
  30. Swanson, D. R., 1990. Somatomedin C and Arginine: Implicit connections between mutually isolated literatures. Perspectives in Biology and Medicine, 33(2), 157-179.
  31. Swanson, D. R., Smalheiser, N. R., 1997. An interactive system for finding complementary literatures: a stimulus to scientific discovery. Artificial Intelligence, 91, 183-203.
  32. Taweel, A., Rector, A., Rogers, J., 2006. A collaborative biomedical research system, Journal of Universal Computer Science, 12, 80-98.
  33. Weeber, M., Kors, J. A., Mons, B., 2005. Online tools to support literature-based discovery in the life sciences. Briefings in Bioinformatics, 6(3), 277-286.
  34. Weikum, G., Kasneci, G., Ramanath, M., Suchanek, F., 2009. Database and Information-retrieval Methods for Knowledge Discovery. Communications of the ACM, 4, 56-64.


Paper Citation


in Harvard Style

Kraines S., Guo W., Hoshiyama D., Mizutani H. and Takagi T. (2010). GENERATING LITERATURE-BASED KNOWLEDGE DISCOVERIES IN LIFE SCIENCES USING RELATIONSHIP ASSOCIATIONS. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 35-44. DOI: 10.5220/0003068100350044


in Bibtex Style

@conference{kdir10,
author={Steven B. Kraines and Weisen Guo and Daisuke Hoshiyama and Haruo Mizutani and Toshihisa Takagi},
title={GENERATING LITERATURE-BASED KNOWLEDGE DISCOVERIES IN LIFE SCIENCES USING RELATIONSHIP ASSOCIATIONS},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={35-44},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003068100350044},
isbn={978-989-8425-28-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - GENERATING LITERATURE-BASED KNOWLEDGE DISCOVERIES IN LIFE SCIENCES USING RELATIONSHIP ASSOCIATIONS
SN - 978-989-8425-28-7
AU - Kraines S.
AU - Guo W.
AU - Hoshiyama D.
AU - Mizutani H.
AU - Takagi T.
PY - 2010
SP - 35
EP - 44
DO - 10.5220/0003068100350044