Authors:
Matthias Pfaff ¹ and Helmut Krcmar ²
Affiliations:
¹ fortiss GmbH, An-Institut Technische Universität München, Germany; ² Technische Universität München, Germany
Keyword(s):
IT Benchmarking, Natural Language Processing, Heterogeneous Data, Semantic Data Integration, Ontologies.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Coupling and Integrating Heterogeneous Data Sources; Data Engineering; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Information Systems Analysis and Specification; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Natural Language Interfaces to Intelligent Systems; Ontologies and the Semantic Web; Ontology Engineering; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems
Abstract:
In the domain of IT benchmarking, collected data are often stored as natural language text and are therefore intrinsically unstructured. To ease data analysis and evaluation across different types of IT benchmarking approaches, a semantic representation of this information is crucial. Thus, the identification of conceptual (semantic) similarities is the first step in the development of integrative data management in this domain. As an ontology is a specification of such a conceptualization, an association of terms, relations between terms, and related instances must be developed. Building on previous research, we present an approach for automated term extraction using natural language processing (NLP) techniques. Terms are automatically extracted from existing IT benchmarking documents, leading to a domain-specific dictionary. These extracted terms are representative of each document, describe the purpose and content of each file, and serve as a basis for the ontology development process in the domain of IT benchmarking.