Aerospace Information System based on Semantic Technonogies

and Ontology Management

A Web Portal for Semantic Search and Document Categorization

F. Gargiulo

, G. Zazzaro

, G. Romano

, G. Gigante

, A. Raggioli

and R. Fusco

Soft Computing Lab, Italian Aerospace Research Centre – Via Maiorise, Capua (CE), Italy

GruppoMeta, Via G. Porzio, 4 – Centro Direzionale, Napoli, Italy

Keywords: Aerospace Lexical Domain Ontology, Word Sense Disambiguation, Aerospace Taxonomy, Semantic

Search, Document Categorization, Aerospace Information System.

Abstract: This paper describes a semantic search tool based on our experience in using a new lexical domain ontology

for aerospace integrated with an open source general purpose ontology to support aerospace engineers in the

timely semantic retrieval of the knowledge. The semantic search module represents an integrated tool

dedicated to the semantic search, extraction and classification of information and knowledge in aerospace

domain. It describes the implementation of a disambiguation algorithm based upon these ontologies and a

new interesting graphical user interface for semantic searches is presented. Furthermore, next to the domain

ontology, a taxonomy for classifying aerospace documents is also proposed. The document classification

algorithm that leverages the deep integration between the proposed lexical domain ontology and taxonomy

is also described. Finally, some considerations about the usage of the semantic search module by the side of

domain experts, semantic experts or common users are reported.

1 INTRODUCTION

The growing demands of developing complex

information systems saving costs and guaranteeing

reliability leads to the adoption of different

paradigms facilitating knowledge sharing,

interoperability, completeness, and reuse. This paper

presents a new integrated tool for semantic search,

extraction and classification of information and

knowledge in aerospace domain. It is based on:

a) The proposal of a new lexical ontology and a

new taxonomy for aerospace domain;

b) The integration of both of them with open

source general purpose ontologies;

c) The implementation of a disambiguation

algorithm based on these ontologies;

d) The implementation of a classification

algorithm that leverages the deep integration

between the new ontology and the new

taxonomy;

e) A graphical user interface that allows natural

language queries and “by meaning” queries on

a very large number of documents (books,

papers, news, websites, etc.) both for experts

and common user.

The tool represents a subsystem of a wider

architecture that was implemented in SIA portal. As

described below, SIA – Sistema Informativo

Aerospaziale, Aerospace Information System

(sia.cira.it) – is software infrastructure for access,

retrieval and exploitation of technical, scientific

information for aerospace and high-tech user

community and related domains and added value

user services.

It comprises the following major subsystems:

I. Web Portal subsystem;

II. Semantic Search subsystem;

III. Linked Open Data subsystem;

IV. Document Warehouse subsystem.

This paper mainly focuses on the Semantic

Search subsystem and provides a only brief

description of the other subsystems.

341

Gargiulo F., Zazzaro G., Romano G., Gigante G., Raggioli A. and Fusco R..

Aerospace Information System based on Semantic Technonogies and Ontology Management - A Web Portal for Semantic Search and Document

Categorization.

DOI: 10.5220/0004994703410348

In Proceedings of 3rd International Conference on Data Management Technologies and Applications (DATA-2014), pages 341-348

ISBN: 978-989-758-035-2

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

2 RELATED WORKS

The ontology term is borrowed from philosophy,

where an Ontology is a systematic account of

Existence. In Artificial Intelligence context we can

describe the ontology of a program by defining a set

of representational terms. In particular, an ontology

is an explicit specification of a conceptualization of

a domain of interest (Gruber, 1993).

An ontology can be written for different tasks

since many domains may need a specific and formal

representation of knowledge:

Data Integration: the purpose is to integrate

heterogeneous information systems. Often different

databases retain the same type of information in

different patterns of data modeling. An ontology can

be used as mediator between database schemas,

allowing you to integrate information in different

patterns and to realize an interpreter between data

from two different sources.

Information Retrieval (IR): IR is the set of

techniques used for the recovery information in

electronic format. IR is the largest field of

application of ontologies because they improve the

accuracy of online searches by adding semantic

information which is useful to reduce the search

space.

Semantic Web: ontologies can be used to solve

various problems of heterogeneity of the Web.

Ontologies can enrich internal representation

(metadata) of meaningful semantic labels, can build

representations to model users with respect to their

information needs and build mechanisms of

mediation between metadata and information needs

of the user (to build custom interfaces).

There are different types of ontologies depending

on abstraction level (Guarino, 1998):

Top-level: ontologies with very general or

abstract concepts such as space, time, behavior,

action, etc. which are independent from specific

domains, so as to be useful for their reusability in

other ontologies. For this reason they are also called

Meta-Ontology. Such ontologies alone may have

little use but they are great for building knowledge

bases.

Domain and Task Ontology: this type describes

the vocabulary related to a generic domain (e.gg

aerospace, medicine, geography) or a generic

problem (e.g. diagnosis, configuration) and it can

specify concepts of a top-level ontology.

Application Ontology: this kind of ontology

describe concepts in a specific domain and the

problems derived from it.

The use of ontologies to provide a single and

shared representation of knowledge for all system

components has been largely motivated in literature

in the last decade: an interesting review of the state

of art of ontology-based software engineering can be

found in (Calero, 2005; Castañeda , 2010; Gasevic,

2009; Farfeleder, 2011), and the last recent

proceedings of international forums like SWESE,

W3C, SEKE discussing synergies between ontology

engineering and software engineering. Synergies are

discussed focusing on different key concerns.

The first concern is related to the development of

life cycle integrating the adoption of ontologies.

The second one proposes methods to develop

ontologies. Literature recognizes mainly two

approaches: the experience-based (Gómez-Pérez,

2004) and the “engineered” based which defines a

set of life cycle activities aiming at prototype

refinement (Uschold, 1996; Noy, 2001).

The third concern is related to the development

of ontologies. Literature proposes ontologies with

different richness of expressivity and to different

purposes. Lightweight ontologies are principally

taxonomies, they include concepts, relationships

between concepts, and properties describing

concepts. Heavyweight ontologies are those which

model knowledge and define restrictions on

domain semantics, by means of axioms and

constraints. Ontologies are developed to support the

development process, to support the knowledge

sharing of general information about “the real

world” and the application domain (medicine,

automotive, railway, aerospace) (Calero, 2005).

The fourth concern aims to develop a complete

framework proposing both new methodologies and

tools to guide the use of ontologies and to apply it to

each phase of software life cycle (Gasevic, 2009). In

the aerospace industry domain ontologies are a

constant in each approach but are rarely defined. An

important work describing a basic ontology for

aerospace is presented in (Malin, 2006) where the

basic concepts of functions, entities and problems

are defined. Specific ontologies are proposed to

support the justification of design, RaDEX (Kuofie,

2010), and the aerospace composite manufacturing

domain (Verhagen, 2011), to define UAV missions

(Schumann, 2012) and to support the intelligence,

surveillance, and reconnaissance (ISR) mission.

NASA addresses the use of ontologies in

different contexts. The CDXA program aims to

integrate knowledge in complex programs proposing

a constellation of ontologies (SWEET, 2011).

DATA2014-3rdInternationalConferenceonDataManagementTechnologiesandApplications

342

Recent European projects adopt sematic

techniques in aerospace domain. The EU CESAR

project bring innovations in the two most

improvable engineering disciplines: Requirements

engineering and Component-based design of

automotive, aerospace and railway (Bogusch, 2011).

3 DESCRIPTION OF THE SIA

PROJECT

SIA project activity has been conceived as a

strategic information tool in order to facilitate the

growth of aerospace knowledge in the Regione

Campania as it is oriented to the main aerospace

actor such as SME, Universities, Research Centers,

etc. This work has been carried on within the

research project SIA, funded by the Campania

Region and EU within the framework of POR

Campania FESR 2007 – 2013.

SIA is SW infrastructure for access, retrieval and

exploitation of technical, scientific information for

aerospace and high-tech user community and related

domains and for providing them added value user

services.

The objective of SIA project is the development

of a SW infrastructure able:

 to guarantee access to the most important

source of information related to the aerospace

domain (paper, technical report, e-books, e-

journals);

 to facilitate the exchange among users of

knowledge related to the research and

development activities in the aerospace

domain;

 to support user in the optimization of complex

activities such as certification task, e-learning,

etc.

SIA services are accessible through a vertical

web portal based on semantic features with

aerospace ontology and taxonomies with which SIA

system:

 make documents content more meaningful for

an efficient search and access of information;

 provide to users a relevant information as

result of a search avoiding the negative

experience of information overload or out of

scope;

 enables users to activate strategies for a more

efficient sharing and spread of domain

knowledge.

Furthermore, SIA web portal offer to end-users

the following functionalities:

1. user profile management for adaptive access

to SIA services and content;

2. advanced information retrieval able to more

exploit semantic information representative

both of documents contents managed by SIA

and user preferences;

3. semantic enterprise wiki in order to facilitate

information exchange among users and to

improve collaborative work;

4. Press and automatic News generation and

aggregation for a more wide information

spread.

In regard to 2 and 4 above, the user can define an

alert (i.e. a set of semantic queries) and she will be

notified when the system will index any document

that satisfies the defined alert.

SIA operational context is built upon the

following software subsystems conceived to satisfy

project requirements:

Web Portal Subsystem: SIA portal through

which services are made available to users (Search,

Browsing, Blog, Wiki, News, News Alert, e-Press,

Reference). Access to SIA web contents and

services depends on user profile regardless of user

device (PC, PDA, Tablet, etc.).

Semantic Search Subsystem: this component

guarantees features in terms of automatic retrieval of

predefined information sources, content filtering,

parsing, word disambiguation, data extraction and

correlation, data classification, indexing, data

storage.

Linked Open Data Subsystem: a triple store

based on Virtuoso and a SPARQ endpoint for

sharing information about the document indexed by

Semantic Search subsystem in RDF format

according to W3C best practice regarding semantic

interoperability.

Document Warehouse Subsystem: assures

loading and storage of structured information

generated in the Semantic Search Subsystem in a

document warehouse for further user analysis tasks

based on OLAP features.

4 THE SEMANTIC SEARCH

SUBSYSTEM

The Semantic Search subsystem, as mentioned

earlier, is characterized by the integration between

the proposed lexical ontology and open source

AerospaceInformationSystembasedonSemanticTechnonogiesandOntologyManagement-AWebPortalforSemantic

SearchandDocumentCategorization

343

general purpose ontologies joined to the

disambiguation and classification algorithms and the

graphical user interface. The following paragraphs

describe the features of each of these aspects.

4.1 Aerospace Lexical Domain

Ontology

The ontology development was inspired by the steps

identified by Pinto & Martins (Pinto and

Martins,

2004), which is roughly reflected in this section.

Specification: the aerospace domain ontology has

a twofold objective. From a broad perspective, the

purpose of this ontology is to fill a gap by

introducing a new domain ontology. In a more strict

sense, the purpose of the ontology is to support

knowledge management, allowing the indexing,

disambiguation, classification and search for

contents in aerospace domain. The scope of the

ontology is limited to its domain, and within this

scope, the emphasis lies on the concepts and

relationships of meaning among them.

Conceptualization: the applicable concepts and

relationships come from an amalgam of various

sources. The domain experts involved into the SIA

project taken very carefully into account the existing

ontologies (SWEET, 2010; Hannessen, 2003; CIRA,

2012). These ontologies have been studied in order

to capture the current state of the art.

Formalization: the lexical domain ontology is

available in two languages: Italian and English

language. The following semantic relations in both

languages are handled:

 Synonymy: it indicates the relationship

between two terms that have the same or

nearly the same meaning in aerospace domain,

such as “space” and “cosmo”;

 Hypernymy: the relation of being

superordinate or belonging to a higher (more

abstract) rank or class. Inverse of hyponym.

For instance, “tree” is hypernym of “oak” and

“poplar”;

 Hyponymy: used to designate a member of a

class. For instance, “Boeing 747” and “Airbus

A380” are hyponym of “Aircraft”;

 Meronym: a word that denotes a constituent

part or a member of something. For example,

“wings” and “engine” are a meronyms of

“Aircraft”;

 Holonym: the opposite of a meronym is a

holonym, the name of the whole of which the

meronym is a part.

Implementation: the domain ontology contains

7,497 Italian words, 5,962 Italian synsets, 5,750

English words and 5,127 English synsets. The Italian

and English version of the ontology share 4,344

multilingual relations. In addition, the Italian version

contains other 1,405 relations. The general purpose

multilingual ontologies used are actually: the

WordNet developed at the Princeton University

(Fellbaum, 1999) for English language and the

MultiWordNet for Italian language. For this reason,

a natural choice was store the domain lexical

ontology in the same database schema of

MultiWordNet and therefore a custom simple

ontology editor was developed. This editor allows to

manage concepts, synsets and relations in the

MySQL database schema. Also, the editor allows the

editing of the domain taxonomy, described below,

contained in the same database. A simpler version of

the editor has been included in the Web Portal

subsystem and it is available for administrative

tasks.

4.2 The Ontologies Integration

The lexical ontologies in the Semantic Search

subsystem are three: WordNet for English language,

MultiWordNet for Italian language and the

aerospace domain lexical ontology for both

languages. The WordNet contains about 117,000

synsets and the currently available release of

MultiWordNet includes information about 58,000

Italian word meanings and 32,700 synsets.

During the morphological analysis, the

disambiguation and indexing phase of an Italian

document, the tool can rely on the MultiWordNet

and the domain ontology to detect the concepts and

assign the meaning to the words in the document; it

uses the WordNet and the domain ontology for

English language instead. Before the disambiguation

phase, concepts that are compound words,

abbreviations or acronyms are detected. These

concepts are usually relevant in domain terminology

and their identification permits a more accurate

tokenization of the sentences. If a token is related to

a concepts belonging both general purpose and

domain ontology, the system performs a sort of

context analysis to determine if a general or a

domain specific meaning would to be assigned to the

word. The system tries to guess if the context is

strictly related to the domain analyzing the previous

and the following sentences and counting the

number of tokens related to the domain terminology.

If this number exceeds a fixed threshold then the

DATA2014-3rdInternationalConferenceonDataManagementTechnologiesandApplications

344

system chooses the domain meaning otherwise the

general meaning.

Even if the domain meaning was chosen, the

system also memorize the general meaning; this

information can be used during the disambiguation

of the adjacent token that do not belong to the

domain ontology.

At the end, the disambiguation algorithm assigns

an identifier to the token. Note that some token does

not have an identifier, in particular: the terms that

were not recognized, articles, conjunctions,

prepositions, etc. The identifiers follow the

MultiWordNet pattern and are formed by part of

speech followed by (#) character and a numeric

string consisting of 8 characters, e.g. n#00001234

(“n” means “noun”). Identifiers of domain

ontology to be unique in the overall system are

prefixed by “d” (domain) and a number. For

example d1n#00001234 represents a domain

concept. The number allows the simultaneous

presence of more domain ontologies and

distinguishes which of domain ontologies the

concept belongs (in this work there is only one

domain ontology).

At this point, the elaboration of a natural

language queries can be briefly explained. In fact

when the user inserts a sentence and executes a

natural language query, the sentence will be

processed in the same manner described above and

at the end of the elaboration the system knows which

concepts – and their ontologies - must search for. In

other words, the system tries to determine the correct

meaning of each word of the sentence from the

sentence itself and therefore the sentence represents

the context wherein the disambiguation is

performed. On the other hand, if the user inserts a

word and executes a “by meaning” query the system

prompts the user to select one or more meanings

among those present in ontologies, as described in

more detail later.

4.3 The Disambiguation Algorithm

The disambiguation algorithm adopted is an

implementation of a variant of the JIGSAW

algorithm for word sense disambiguation proposed

by University of Bari (Basile, 2007). For reasons of

space the algorithm will not be described here

(please, for the details refer to the original paper and

to the documentation of the University of Bari). In

this paper only the main changes occurred during the

implementation will be described. The changes were

aimed at improving integration with the other

components of the system and an improvement of

the performance.

The first adjustment involved the modules

assigned to morpho-syntactical analysis and

tokenization. The original system had its own

morpho-syntactic analysis module, this module has

been removed and replaced by two modules,

respectively: Gate for English language and

TreeTagger (Schmid, 1995) for Italian language.

With regard to the performance, it was noted that

the introduction of a caching mechanism runs the

algorithm about 5 times faster. This mechanism

helps to avoid re-running the analysis of a token if a

syntactic constructs (with equivalent terminology)

was analyzed above. Therefore, could be formulated

a conjecture about the high frequency of re-use of

terminology in very specialized domains documents

(such as aerospace).

4.4 The Classification Algorithm

A Bayesian model is trained in order to classify the

indexed documents in taxonomy categories, Table 1.

It is based on the Weka (Hall, 2009) implementation

of the Bayesian multinomial classifier named

NaiveBayesMultinomial (McCallum , 1998).

Table 1: The domain taxonomy.

Level 1 Level 2 Level 3

Aerospace

Aeronautics

Physics

Materials

Propulsion

Equipment

Design and Validation

Traffic Management

& Airports

Space

Physics

Materials

Propulsion

Equipment

Design and Validation

Ground Support &

Launch Operations

Sciences

During the training phase of the classification

model a standard training set based on an association

between documents and classification taxonomy

categories was not used. In fact such a kind of

training set requires a huge number of documents

manually tagged with the category. A different

approach instead was proposed; it is based on the

AerospaceInformationSystembasedonSemanticTechnonogiesandOntologyManagement-AWebPortalforSemantic

SearchandDocumentCategorization

345

presence of domain ontology and semantic

disambiguation system. As mentioned, the

disambiguation algorithm determines if a word can

be associate to a concept belonging to the domain

and this information can provide a significant

contribution to the attribution of a document to a

specific category. In fact, in the domain terminology

are often present terms closely related to some

categories, such as the name of the specific missile

propellant, on-board equipment, etc. The presence of

such terms often accurately directs the document to

a category and, at the same time, filters out

potentially noise resulting from the generic

terminology.

Then, the domain ontology concepts were

associated with the taxonomy categories with a

weight that represents the degree of membership. In

a number of cases it was not possible to create this

association because the concept is too general (i.e.,

"Flight") or the domain experts did not found the

association. About 2,880 domain ontology concepts

are associated to one or more taxonomy categories

and with these concepts the training sets – one for

each category – are built. In particular, the training

set of a category contains the concepts associated to

that category and the weight of the association

represents the label. When a document has to be

classified, the system detects all domain concepts

associated to one or more categories and submits

them to the classification model. Finally, the

classifier evaluates the degree of membership of the

document to every taxonomy category.

4.5 The Semantic Search

Semantic Search function is the core of SIA. User

can search documents through input query written in

natural language or keyword-based. Until now,

about 800,000 documents in both languages were

indexed and it has been designed in order to aid user

in the search of useful information and documents.

At this end, in the SIA system three search features

have been implemented:

Natural Language Search: an user can search

documents through input query written in natural

language. This search activates semantic

disambiguation of the user input text and the system

founds documents containing semantically

disambiguated terms consistent to the context

analysis performed on input text.

Lemma Search: SIA identifies the different

meanings of each user input text through a querying

into a general ontology (WordNet and

MultiWordNet) and a domain ontology (SIA

aerospace ontology). SIA shows an interactive

window with all possible meaning and relations

related to the search term. User can select a specific

meaning (lemma) and its semantic relationship with

other lemma in order to refine search. In SIA, this

kind of query is also called “by meaning”.

Keywords Search: traditional full-text search

performed on the basis of user query terms.

Whatever is the search feature selected by the

user, SIA returns search results grouped by

predefined facet: data source, format (html, pdf,

word, etc.), category (main aeronautical and space

taxonomy class), type (magazine, journal, etc.),

authors, keywords, domain entity. Furthermore,

search results are ordered according to a score

function evaluated on the basis of the Virtuoso

scoring algorithm. After the user selects a document,

the system redirects her to a detail page. In this page

there are also a list of similar document to the

selected one. As described previously, the Natural

Language Search lead back to Lemma Search

therefore the latter will be described in more details.

The user executes the following steps: selects

this kind of search, types a word and chooses the

language. At this point, the system will guide her in

the choice of the meaning or meanings of the word

that should be looking for. It displays a tree like the

one shown in the figure 1.

Figure 1: The GUI of lemma search in SIA.

All nodes of this tree, but the root, are concepts

contained in the general or domain ontologies.

On mouse move the system shows a tooltip for

each node with the exact definition of the concept

and the lemmas it contains, figure 2.

DATA2014-3rdInternationalConferenceonDataManagementTechnologiesandApplications

346

Figure 2: The tooltip with definition of the general

meaning of the term (the pink top node) “fuselage”.

The red node is the root of the tree and it

represents the term T that the user entered

(“fuselage” in figure 1). The children of the root

represents all the broad meanings of the term T. In

practice, for each child C of the root the term T is a

lemma of C (i.e., T belongs to the synset of C).

Moreover, if D is a child of C it means that there is a

relationship between C and D (in one of the two

ontologies) and the color of D represents the type of

relationship. As mentioned, the semantic relations

handled are: synonymy, hypernymy, hyponymy,

holonymy, meronymy; it is possible to think the

concept D as a "specialization" of the concept C (for

example, D is "part of" C or D is an hyponym of C).

Note that the term T can belong to both concepts

C and D. In this case, the root is connected only to

the node C because C is a more general concept of

Each node in the tree, but the root, can be

selected and the search will return documents in

which the term T is used in the meanings that

correspond to nodes/concepts selected.

The interpretation of the edge between the root

and its children is different with respect to the

interpretation of the edges the links a child of the

root to its children. In the first case, the children of

the root are the broad meanings of the term T. At

this level the user manually disambiguate the term

and the color of the child indicates the respective

ontology, whereby the user can decide whether to

continue the search on only one of the ontologies, or

on both.

The other edges always represent semantic

relations and the color of the children of a child of

the root specifies the kind of relationship. At this

level, the user refine the meaning of the concept that

must be searched for.

This type of representation makes it possible to

distinguish the concepts that belong to the domain

ontology from those that belong to the

MultiWordNet. In fact, the concepts of domain

ontology have different colors are drawn in a

different side of the screen than the general

ontology, figure 1. The layout used to draw the tree

will tend always to separate the nodes of an ontology

from those of the other. The number of nodes in the

domain ontology compared to the number of nodes

of the general ontology provides intuitively a

measure of how much the current search is relevant

with the aerospace domain.

This graphical user interface can be used by

expert users who are familiar with the use of

semantic relations; by user who are familiar with

domain terminology; by common users.

For the first group of users, the GUI lets them

select the semantic relations while for the other

groups of users, they simply selects the meanings of

a term and can freely ignore relations. In any case, it

is clear that the lower nodes of the tree the more

restrictive and specific meanings is associated to the

term.

5 CONCLUSIONS AND FUTURE

WORKS

This paper describes an integrated SW tool aimed to

semantic search, extraction and classification of

information in the aerospace domain. A new lexical

aerospace domain ontology is proposed. The tool is

based on the integration between lexical ontologies

and algorithms operating on them and, in order to

get a better user experience, a GUI is also proposed.

Experimental results regarding the performances

of the tool was obtained in two ways. Precision and

Recall measure (Davis, 2006) were firstly calculated

on a test set consisting of 50 documents. The results

were approximately the same in (Davis, 2006). Due

to the small number of documents, these measures

were not been interpreted as a measure of the

performance of the disambiguation algorithm but

rather as a confirmation that the changes introduced

(i.e., Gate, TreeTager and caching) did not worsen

the original algorithm. On the other hand, the

response of domain users who have followed the

experimental phase, represented a very positive

qualitative assessment of the tool. In order to obtain

consistent experimental results about the global

performance of the described semantic search

module it is necessary a comparison between similar

systems developed for the aerospace domain that

expose similar functionalities (lemma search, natural

language search, etc.). Also, the employed set of

ontologies plays a central role in the performance of

the each system and how effective is a comparison

among systems based on different ontologies is not a

trivial matter. On the other hand, for the

disambiguation algorithm used in semantic search

AerospaceInformationSystembasedonSemanticTechnonogiesandOntologyManagement-AWebPortalforSemantic

SearchandDocumentCategorization

347

module it is possible to refers to the experimental

results provided in (Basile, 2007). In particular, the

disambiguation algorithm has been evaluated by

SemEval-2007 task. The algorithms were scored

according to standard IR/CLIR measures as

implemented in the TREC evaluation package

(http://trec.nist.gov/).Future works will aim to the

maintenance of the lexical ontology. Updating and

correcting the implemented ontology will be

achieved by the publication of the ontology editing

functions and the preparation of a controlled change

management process for the approval of changes

suggested by users.

REFERENCES

Basile, P., de Gemmis, M., Gentile, A. L., Lops, P.,

Semeraro, G., 2007. UNIBA: JIGSAW algorithm for

word sense disambiguation. Proceedings of the 4th

International Workshop on Semantic Evaluations (pp.

398-401). Association for Computational Linguistics.

Bogusch, R., Gerlach, S., 2011. Optimierungen im

Requirements-Engineering in der Praxis.

Calero, C., Ruiz, F., Piattini, M., 2005. Ontologies in

Software Engineering and Software Technology.

Springer.

Castañeda, V., Ballejos, L., Caliusco, M., Galli, M., 2010.

The Use of Ontologies in Requirements Engineering.

Global Journal of Researches in Engineering, Vol. 10

Issue 6.

CIRA, 2012. Thesaurus: descrittori logici delle attività

aerospaziali. Italian Research Aerospace Centre

internal technical document.

Davis J., Goadrich M., 2006. The Relationship Between

Precision-Recall and ROC Curves. Proceedings of the

International Conference on Machine Learning,

Pittsburgh, PA.

Farfeleder, S., Moser, T., Krall, A., 2011. Ontology-

Driven Guidance for Requirements Elicitation.

ESWC, Part II, LNCS 6644, pp. 212–226 Springer-

Verlag Berlin Heidelberg.

Fellbaum, C., 1999. WordNet. Blackwell Publishing Ltd.

Gasevic, D., Kaviani, N., Milanovi, M., 2009. Ontologies

and Software Engineering. International Handbooks

on Information Systems, pp 593-615 Springer.

Gómez-Pérez, A., Fernández-López, M., Corcho, O.,

2004. Ontological Engineering. Springer, New York.

USA.

Gruber, T. R., 1993. Toward Principles for the Design of

Ontologies Used for Knowledge Sharing. International

Journal Human-Computer Studies 43, p.907-928.

Guarino N., 1998. Formal Ontology and Information

Systems. Proceedings of FOIS’98, Trento, Italy.

Amsterdam, IOS Press, pp. 3-15 11.

Hall, M., Frank, E., Holmes, G., Pfahringer, B.,

Reutemann, P., Witten, I. H., 2009. The WEKA data

mining software: an update. ACM SIGKDD

explorations newsletter, 11(1), 10-18.

Hannessen, D. P., Donker, J. C., 2003. ACARE Taxonomy

A common European taxonomy for aeronautical

research & technology.

Kuofie, E. J., 2010. RaDEX: A Rationale-based Ontology

for Aerospace Design Explanation. Master of Science

Programme Business Information Technology

University of Twente. http://essay.utwente.nl/59926/

Malin, J., Throop, D., 2006. Basic Concepts and

Distinctions for an Aerospace Ontology of Functions,

Entities and Problems. Aerospace Conference, IEEE.

McCallum, A., Nigam, K., 1998. A Comparison of Event

Models for Naive Bayes Text Classification. AAAI-98

workshop on learning for text categorization (Vol.

752, pp. 41-48).

Noy, N., McGuinness, D., 2001. Ontology development

101: A guide to creating your first ontology.

Pinto H. S., Martins J. P., 2004. Ontologies: How can they

be built? Knowledge and Information Systems, Vol. 6,

No. 4, pp. 441–464.

Schmid, H., 1995. TreeTagger| a Language Independent

Part-of-speech. Tagger.Institut für Maschinelle

Sprachverarbeitung, Universität Stuttgart, 43.

Schumann, B., Scanlany, J., Fangohrz, H., 2012. A

Generic Unifying Ontology for Civil Unmanned

Aerial. 12

AIAA Aviation Technology.

SWEET Ontologies - Nasa, 2011. URL:

sweet.jpl.nasa.gov/ontology.

Uschold., M., 1996. Building ontologies: towards a

unified methodology. Proceedings of 16th Annual

Conference of the British Computer Society

Specialists Group on Expert Systems.

Verhagen, W. J., Curran, R., 2011. Ontological modelling

of the aerospace composite manufacturing domain.

Improving Complex Systems Today (pp. 215-222).

Springer London.

Bloehdorn, S., Hotho, A., 2006. Boosting for text

classification with semantic features. Advances in

Web mining and Web usage Analysis (pp. 149-166).

Springer Berlin Heidelberg.

Song, M. H., Lim, S. Y., Kang, D. J., Lee, S. J., 2005.

Automatic classification of web pages based on the

concept of domain ontology. Software Engineering

Conference, 2005. APSEC'05. 12th Asia-Pacific (pp.

7-pp). IEEE.

Wu, S. H., Tsai, T. H., Hsu, W. L., 2003. Text

categorization using automatically acquired domain

ontology. Proceedings of the sixth international

workshop on Information retrieval with Asian

languages-Volume 11 (pp. 138-145). Association for

Computational Linguistics.

DATA2014-3rdInternationalConferenceonDataManagementTechnologiesandApplications

348