Computing Semantic Textual Similarity based on Partial Textual Entailment

Martin Víta

Abstract

As the amount of textual data grows rapidly, discovering semantically similar documents is becoming increasingly important. This general task arises in a wide range of practical applications, from recommender systems to plagiarism detection and duplicate removal tools. The proposed doctoral project focuses on applying a generalized concept of textual entailment to measuring semantic textual similarity. The notion of cross-lingual partial textual entailment will be introduced to handle semantic textual similarity among multilingual documents.


Paper Citation


in Harvard Style

Víta M. (2015). Computing Semantic Textual Similarity based on Partial Textual Entailment. In Doctoral Consortium - DC3K, (IC3K 2015), pages 3-12. DOI: 10.5220/0005647600030012


in Bibtex Style

@conference{dc3k15,
author={Martin Víta},
title={Computing Semantic Textual Similarity based on Partial Textual Entailment},
booktitle={Doctoral Consortium - DC3K, (IC3K 2015)},
year={2015},
pages={3-12},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005647600030012},
}


in EndNote Style

TY - CONF
JO - Doctoral Consortium - DC3K, (IC3K 2015)
TI - Computing Semantic Textual Similarity based on Partial Textual Entailment
AU - Víta M.
PY - 2015
SP - 3
EP - 12
DO - 10.5220/0005647600030012