automating the term extraction from benchmarking
documents, the development of this ontology is
accelerated. This acceleration is even more important
for maintaining an ontology. Since the initial
development of such an ontology is only the first step,
extension and maintenance are activities that are also
supported by the automated term extraction. This is
especially useful if new domain-specific terms need to
be identified in new documents, such as service
descriptions (e.g. related to topics like cloud
computing).
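As a rough illustration of the maintenance case, the following minimal sketch flags frequent words of a new document that are not yet part of the ontology vocabulary. The function name, the stopword list, the frequency threshold and the sample text are assumptions made for this example; lemmatisation and the other processing steps of the actual approach are deliberately left out.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real pipeline would use
# a full, language-specific list and a lemmatiser).
STOPWORDS = {"the", "a", "an", "of", "and", "for", "in", "is", "to",
             "per", "are", "with", "on"}

def candidate_terms(text, known_terms, min_count=2):
    """Return frequent words of `text` that are missing from the ontology."""
    # Lowercase the text and keep alphabetic tokens of at least two characters.
    tokens = re.findall(r"[a-z][a-z\-]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    # A word becomes a candidate if it is frequent enough and still unknown.
    return sorted(t for t, n in counts.items()
                  if n >= min_count and t not in known_terms)

# Hypothetical ontology vocabulary and hypothetical service description.
known = {"server", "cost", "storage"}
doc = ("The cloud service bills storage per tenant. "
       "Each tenant of the cloud service has a cost limit.")
print(candidate_terms(doc, known))
```

The resulting candidates would still have to be reviewed by a domain expert before they are added to the ontology.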
Future work will focus on steps two and three, shown
in Figure 1. As shown there, the conceptualization of
terms generally leads to a cyclical adjustment of the
initially developed ontology. As this process needs to
be supervised by a domain expert, only a semi-automation
of this step is possible so far. Nevertheless, this
semi-automation will be developed. To support the
domain expert during this step, the differences between
two ontology versions (before and after the automatic
term extraction) will be identified and presented to
the expert. Moreover, this kind of versioning helps to
comprehend the development process of the whole
ontology.
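One possible shape of such a version comparison is sketched below. It assumes, purely for illustration, that an ontology version can be reduced to a mapping from each term to its parent concept; the function name and the sample terms are hypothetical and do not reflect the planned implementation.

```python
def diff_versions(before, after):
    """Compare two ontology versions given as {term: parent_concept} dicts."""
    old_terms, new_terms = set(before), set(after)
    return {
        # Terms introduced by the automatic term extraction.
        "added": sorted(new_terms - old_terms),
        # Terms that no longer occur in the newer version.
        "removed": sorted(old_terms - new_terms),
        # Terms present in both versions but attached to another concept.
        "moved": sorted(t for t in old_terms & new_terms
                        if before[t] != after[t]),
    }

# Hypothetical versions before and after an extraction run.
v1 = {"server": "hardware", "license": "software", "helpdesk": "service"}
v2 = {"server": "hardware", "license": "cost",
      "helpdesk": "service", "tenant": "service"}

report = diff_versions(v1, v2)
for kind, terms in report.items():
    print(kind, terms)
```

A report of this kind gives the domain expert a short list of changes to confirm or reject, instead of the complete ontology.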
In a last step, already existing output data will be
linked to the domain ontology, such as cost or
performance values collected from different companies
over the last seven years and persisted in various
databases (e.g. MySQL or Access DB). Thus, the
conceptualization of logical structures in this domain
is used to access benchmarking data without the need to
develop a unified database schema. New databases can
therefore be linked to existing ones through an
abstraction layer, the ontology.
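The idea of the abstraction layer can be sketched as follows, under several assumptions: two in-memory SQLite databases stand in for the MySQL and Access sources, the concept name "ServerCost" and all table and column names are invented, and the ontology is reduced to a plain mapping from a concept to the columns that store it.

```python
import sqlite3

def make_db(ddl, insert_sql, rows):
    """Create an in-memory database that stands in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn

# Two sources with different, non-unified schemas (hypothetical names).
db_a = make_db("CREATE TABLE kpi (company TEXT, server_cost REAL)",
               "INSERT INTO kpi VALUES (?, ?)",
               [("A", 1200.0), ("B", 950.0)])
db_b = make_db("CREATE TABLE bench (firm TEXT, srv_kosten REAL)",
               "INSERT INTO bench VALUES (?, ?)",
               [("C", 1010.0)])

# The "ontology" here is only a concept-to-column mapping (an assumption).
MAPPING = {
    "ServerCost": [
        (db_a, "kpi", "company", "server_cost"),
        (db_b, "bench", "firm", "srv_kosten"),
    ],
}

def values_for(concept):
    """Collect (company, value) pairs for a concept from all linked sources."""
    results = []
    for conn, table, key_col, value_col in MAPPING[concept]:
        # Identifiers come from the trusted mapping above, never from users.
        query = f"SELECT {key_col}, {value_col} FROM {table} ORDER BY {key_col}"
        results.extend(conn.execute(query).fetchall())
    return results

print(values_for("ServerCost"))
```

Linking a further database then only requires another entry in the mapping, while the existing schemas stay untouched.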
REFERENCES
Alatrish, E. S., Tosic, D., and Milenkovic, N. (2014). Build-
ing ontologies for different natural languages. Com-
put. Sci. Inf. Syst., 11(2):623–644.
Bird, S., Klein, E., and Loper, E. (2009). Natural Language
Processing with Python: Analyzing Text with the Nat-
ural Language Toolkit. O’Reilly, Beijing.
Brewster, C. and O’Hara, K. (2007). Knowledge repre-
sentation with ontologies: Present challenges - fu-
ture possibilities. International Journal of Human-
Computer Studies, 65(7):563–568.
Cambria, E., Hussain, A., and Eckl, C. (2011). Bridging the
gap between structured and unstructured health-care
data through semantics and sentics. In Proceedings of
ACM.
Camp, R. (1989). Benchmarking: The search for indus-
try best practices that lead to superior performance.
Quality Press, Milwaukee, Wis.
Camp, R. (1995). Business process benchmarking : find-
ing and implementing best practices. ASQC Quality
Press, Milwaukee, Wis.
Chandrasekaran, B., Josephson, J. R., and Benjamins, V. R.
(1999). What are ontologies, and why do we need
them? IEEE Intelligent Systems, 14(1):20–26.
Fernandez-Lopez, M., Gomez-Perez, A., and Juristo, N.
(1997). Methontology: from ontological art to-
wards ontological engineering. In Proceedings of the
AAAI97 Spring Symposium, pages 33–40.
Apache Software Foundation (2014). Apache POI API.
http://poi.apache.org.
Gacenga, F., Cater-Steel, A., Tan, W., and Toleman, M.
(2011). IT service management: towards a contingency
theory of performance measurement. In International
Conference on Information Systems, pages 1–18.
Guarino, N. (1995). Formal ontology, conceptual analysis
and knowledge representation. International Journal
of Human-Computer Studies, 43(5-6):625–640.
Horkoff, J., Borgida, A., Mylopoulos, J., Barone, D., Jiang,
L., Yu, E., and Amyot, D. (2012). Making Data Mean-
ingful: The Business Intelligence Model and Its For-
mal Semantics in Description Logics, volume 7566 of
Lecture Notes in Computer Science, book section 17,
pages 700–717. Springer Berlin Heidelberg.
Jakob, M., Pfaff, M., and Reidt, A. (2013). A literature
review of research on IT benchmarking. In Krcmar, H.,
Goswami, S., Schermann, M., Wittges, H., and Wolf,
P., editors, 11th Workshop on Information Systems and
Service Sciences, volume 25.
Juršic, M., Mozetic, I., Erjavec, T., and Lavrac, N. (2010).
Lemmagen: Multilingual lemmatisation with induced
ripple-down rules. Journal of Universal Computer
Science, 16(9):1190–1214.
Karanikolas, N. N. and Skourlas, C. (2010). A parametric
methodology for text classification. Journal of Infor-
mation Science, 36(4):421–442.
Kütz, M. (2006). IT-Steuerung mit Kennzahlensystemen.
dpunkt.verlag, Heidelberg.
Lame, G. (2005). Using NLP techniques to identify legal
ontology components: Concepts and relations. In
Benjamins, V., Casanovas, P., Breuker, J., and Gangemi,
A., editors, Law and the Semantic Web, volume 3369
of Lecture Notes in Computer Science, pages 169–
184. Springer Berlin Heidelberg.
LemmaGen (2011). LemmaGen, multilingual open
source lemmatisation framework.
http://lemmatise.ijs.si/Services.
Maynard, D., Li, Y., and Peters, W. (2008). NLP techniques
for term extraction and ontology population. In Pro-
ceedings of the 2008 Conference on Ontology Learn-
ing and Population: Bridging the Gap Between Text
and Knowledge, pages 107–127, Amsterdam, The
Netherlands. IOS Press.
Müller, M. (2010). Fusion of Spatial Information Models
with Formal Ontologies in the Medical Domain. The-
sis.