IMPRECISE EMPIRICAL ONTOLOGY REFINEMENT - Application to Taxonomy Acquisition

Vít Nováček

2007

Abstract

The significance of uncertainty representation has recently become evident in the Semantic Web community. This paper presents new results of our research on incorporating uncertainty into ontologies created automatically by means of Human Language Technologies. The research is related to OLE (Ontology LEarning), a project aimed at bottom-up generation and merging of ontologies. It utilises a proposed expressive fuzzy knowledge representation framework called ANUIC (Adaptive Net of Universally Interrelated Concepts). We discuss our recent achievements in taxonomy acquisition and show how even a simple application of the principles of ANUIC can improve the results of initial knowledge extraction methods.

  1. Process the resources by the pattern-based method and produce a set of ontologies S_p.
  2. Merge the ontologies in S_p into one ontology R.
  3. Process the resources by the clustering-based method (Alg. 1 and Alg. 2), using R as a reference ontology in Alg. 2, and produce a set of ontologies.
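
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the OLE implementation: the single "X such as Y" pattern, the share-of-ontologies edge weight, and the names `extract_by_pattern`, `merge`, `refine` and `cluster_with_reference` are all hypothetical stand-ins, and the clustering-based method is passed in as a callable because its details are not given here.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (hyponym, hypernym), i.e. one is-a relation

# Hypothetical stand-in for the pattern-based method: one lexical pattern only.
SUCH_AS = re.compile(r"(\w+) such as (\w+)")


def extract_by_pattern(text: str) -> Set[Edge]:
    """Step 1 for one resource: collect is-a edges matched by the pattern."""
    return {(hypo.lower(), hyper.lower()) for hyper, hypo in SUCH_AS.findall(text)}


def merge(ontologies: Iterable[Set[Edge]]) -> Dict[Edge, float]:
    """Step 2: merge edge sets into one weighted reference ontology R.

    Assumption: an edge's degree is the share of ontologies that contain it.
    This is a simplistic placeholder, not the ANUIC measure.
    """
    ontologies = list(ontologies)
    counts = Counter(edge for onto in ontologies for edge in onto)
    return {edge: n / len(ontologies) for edge, n in counts.items()}


def refine(
    resources: List[str],
    cluster_with_reference: Callable[[str, Dict[Edge, float]], object],
) -> List[object]:
    """Run steps 1-3; the clustering-based method is supplied by the caller."""
    s_p = [extract_by_pattern(text) for text in resources]            # step 1
    r = merge(s_p)                                                    # step 2
    return [cluster_with_reference(text, r) for text in resources]    # step 3
```

Passing the clustering step as a parameter keeps the sketch honest about what is unknown: only the data flow (S_p, then R, then clustering against R) is taken from the steps above.
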

Require: r - number of optimisation repeats; a value of 5 was found to be sufficient
Require: pickBal(d_i, V) - function (left abstract to keep the description simple) that pops a subset S from the set V; S is characterised by these conditions: (1) all v ∈ S are the closest possible vectors to d_i, and (2) all the sets picked from V are balanced in size after a sequence of pickBal() applications that makes V empty
M_init ← {random v ∈ V} {* initial means *}
V_tmp ← V
repeat
    c ← centroid(M_init)
    v ← u such that dist(u, c) is maximal for u ∈ V_tmp
    M_init ← M_init ∪ {v}
    V_tmp ← V_tmp - {v}
until |M_init| ≥ k
FACT ← {} {* empty map *}
V_tmp ← V
j ← 0
for d_i ∈ M_init do
    S_balanced ← pickBal(d_i, V_tmp)
    j ← j + 1
    FACT[j] ← S_balanced
end for
C ← ∅
for j ∈ FACT.keys() do
    C ← C ∪ {centroid(FACT[j])}
end for
VECT2SCORE ← {} {* empty map *}
for v ∈ V do
    VECT2SCORE[v] ← {(c_0, 0), ..., (c_{k-1}, k - 1)} such that {c_0, ..., c_{k-1}} is a sequence of centroids from C ordered by increasing distance from v
end for
CLUST ← ∅ {* clustering structure *}
S ← {} {* empty map *}
for j ∈ {1, ..., r} do
    S_tmp ← random shuffle of V
    initialise clustering c_j with clusters given by the pivotal centroids from C
    sequentially process S_tmp and assign each vector to the nearest available cluster from c_j, keeping the clusters as balanced in size as possible
    compute the score S[j] for the obtained clustering by summing up the numbers pointed to by the respective centroids in VECT2SCORE for each vector in each cluster in c_j
    CLUST ← CLUST ∪ {c_j}
end for
return c_x ∈ CLUST with the lowest associated score S[x], x ∈ {1, ..., r}
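
A minimal Python sketch of the pseudocode above, using only the standard library. Several points are assumptions rather than details from the paper: vectors are tuples of floats, dist is the Euclidean distance, pickBal is realised by taking the n/k nearest remaining vectors for each initial mean, and a cluster counts as "available" while it holds fewer than ceil(n/k) vectors. The function name `balanced_clusters` is hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def centroid(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty sequence of vectors."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def balanced_clusters(vectors: Sequence[Vector], k: int, r: int = 5,
                      seed: int = 0) -> Tuple[List[List[Vector]], int]:
    n = len(vectors)
    if not 1 <= k <= n or r < 1:
        raise ValueError("need 1 <= k <= len(vectors) and r >= 1")
    rng = random.Random(seed)

    # Initial means: one random vector, then repeatedly the vector farthest
    # from the centroid of the means chosen so far, until there are k of them.
    means = [rng.randrange(n)]
    rest = set(range(n)) - set(means)
    while len(means) < k:
        c = centroid([vectors[i] for i in means])
        far = max(sorted(rest), key=lambda i: math.dist(vectors[i], c))
        means.append(far)
        rest.remove(far)

    # pickBal (assumed realisation): each mean pops its nearest remaining
    # vectors, with subset sizes differing by at most one; the centroids of
    # these subsets become the pivotal centroids C.
    sizes = [n // k + (1 if j < n % k else 0) for j in range(k)]
    pool = list(range(n))
    centroids = []
    for m, size in zip(means, sizes):
        pool.sort(key=lambda i: math.dist(vectors[i], vectors[m]))
        picked, pool = pool[:size], pool[size:]
        centroids.append(centroid([vectors[i] for i in picked]))

    # VECT2SCORE: rank[i][j] is the position of centroid j when all centroids
    # are ordered by increasing distance from vector i (0 = nearest).
    rank = []
    for v in vectors:
        order = sorted(range(k), key=lambda j: math.dist(v, centroids[j]))
        ranks = [0] * k
        for pos, j in enumerate(order):
            ranks[j] = pos
        rank.append(ranks)

    # r randomised passes; keep the clustering with the lowest rank sum.
    cap = math.ceil(n / k)  # assumed capacity for "available" clusters
    best, best_score = None, math.inf
    for _ in range(r):
        order = list(range(n))
        rng.shuffle(order)
        clusters = [[] for _ in range(k)]
        score = 0
        for i in order:
            open_js = [j for j in range(k) if len(clusters[j]) < cap]
            j = min(open_js, key=lambda j: rank[i][j])
            clusters[j].append(i)
            score += rank[i][j]
        if score < best_score:
            best, best_score = clusters, score
    return [[vectors[i] for i in c] for c in best], best_score
```

A score of 0 means every vector ended up in the cluster of its nearest pivotal centroid; higher scores indicate that the balance constraint pushed some vectors into more distant clusters.
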

Paper Citation


in Harvard Style

Nováček V. (2007). IMPRECISE EMPIRICAL ONTOLOGY REFINEMENT - Application to Taxonomy Acquisition. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-972-8865-89-4, pages 31-38. DOI: 10.5220/0002391800310038


in Bibtex Style

@conference{iceis07,
author={Vít Nováček},
title={IMPRECISE EMPIRICAL ONTOLOGY REFINEMENT - Application to Taxonomy Acquisition},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2007},
pages={31-38},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002391800310038},
isbn={978-972-8865-89-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - IMPRECISE EMPIRICAL ONTOLOGY REFINEMENT - Application to Taxonomy Acquisition
SN - 978-972-8865-89-4
AU - Nováček V.
PY - 2007
SP - 31
EP - 38
DO - 10.5220/0002391800310038