Authors: Aliya Nugumanova 1; Madina Mansurova 2; Ermek Alimzhanov 2; Dmitry Zyryanov 1 and Kurmash Apayev 1
Affiliations: 1 D. Serikbayev East Kazakhstan State Technical University, Kazakhstan; 2 al-Farabi Kazakh National University, Kazakhstan
Keyword(s): Concept Map, Co-occurrence Analysis, Term-term Matrix, Term-document Matrix, Chi-squared Test.
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Biomedical Engineering; Data Engineering; Enterprise Information Systems; Health Information Systems; Information Systems Analysis and Specification; Knowledge Engineering and Ontology Development; Knowledge Management; Knowledge-Based Systems; Ontologies and the Semantic Web; Ontology Engineering; Society, e-Business and e-Government; Symbolic Systems; Web Information Systems and Technologies
Abstract:
The aim of this work is to demonstrate the usefulness and efficiency of statistical text-processing methods for the automatic construction of concept maps for a predetermined domain. The statistical methods considered in this paper are based on the analysis of term co-occurrence in the domain documents. To perform this analysis, in the first step we construct a term-document frequency matrix, from which we can estimate the correlation between terms and the domain under consideration. In the second step we move from the term-document matrix to a term-term matrix, which allows us to estimate the correlation between pairs of terms. This approach lets us define the links between concepts as the term pairs with the highest correlation values. In the third step, we summarize the obtained information by identifying concepts as the nodes and links as the edges of a graph, and construct the concept map as the resulting graph.
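The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which correlation measure is used (the keywords mention a chi-squared test), so cosine similarity between term rows is used here purely as a stand-in. The function names, the toy documents, the whitespace tokenization, and the `top_k` cutoff are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

def term_document_matrix(docs, vocabulary):
    """Step 1: rows are terms, columns are documents, entries are raw counts."""
    tokenized = [doc.lower().split() for doc in docs]
    return [[tokens.count(term) for tokens in tokenized] for term in vocabulary]

def cosine(u, v):
    """Cosine similarity of two count vectors (assumed stand-in measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def term_term_matrix(td):
    """Step 2: pairwise similarity between the term rows of the matrix."""
    n = len(td)
    return [[cosine(td[i], td[j]) for j in range(n)] for i in range(n)]

def concept_map(vocabulary, tt, top_k):
    """Step 3: keep the top_k highest-scoring term pairs as graph edges."""
    pairs = [
        (tt[i][j], vocabulary[i], vocabulary[j])
        for i, j in combinations(range(len(vocabulary)), 2)
    ]
    pairs.sort(reverse=True)
    return [(a, b) for score, a, b in pairs[:top_k] if score > 0]

# Toy corpus and vocabulary (hypothetical, single-word terms only).
docs = [
    "graph node edge",
    "graph edge",
    "ontology concept",
    "concept ontology graph",
]
vocabulary = ["graph", "node", "edge", "ontology", "concept"]

td = term_document_matrix(docs, vocabulary)
tt = term_term_matrix(td)
edges = concept_map(vocabulary, tt, top_k=2)
print(edges)
```

In this sketch the nodes of the concept map are the vocabulary terms and the edges are the selected pairs; swapping `cosine` for a chi-squared statistic computed on co-occurrence counts would change only the second step.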